(57) Abstract

A Protein Complementation Assay/Universal 
Reporter System (PCA/URS) for detecting and 
screening for agonists and antagonists of a cellular 
receptor is described. The invention is illustrated by 
the example of murine dihydrofolate reductase 
(DHFR). Fusion peptides consisting of N- and C-terminal
fragments of murine DHFR fused to GCN4
leucine zipper sequences were coexpressed in
Escherichia coli grown in minimal medium, where the
endogenous DHFR activity was inhibited with
trimethoprim. Coexpression of the complementary
fusion products restored colony formation. Survival
only occurred when both DHFR fragments were
present and contained leucine-zipper forming
sequences, demonstrating that reconstitution of
enzyme activity requires assistance of leucine zipper
formation. This assay could be used to study
equilibrium and kinetic aspects of various molecular
interactions.



[Figure: Transfection/Transformation; SUBSTRATE to PRODUCT; readout: COLORIMETRIC, FLUOROMETRIC, SURVIVAL]



•(Referred to in PCT Gazette No. 30/2000, Section II) 




PCT/CA99/00702 

WO 00/07038 



TITLE OF THE INVENTION 

PROTEIN FRAGMENT COMPLEMENTATION ASSAYS 



FIELD OF THE INVENTION

The present invention relates to the determination of the
function of novel gene products. The invention further relates to Protein
fragment Complementation Assays (PCA). PCAs allow for the detection of a
wide variety of types of protein-protein, protein-RNA, protein-DNA,
protein-carbohydrate or protein-small organic molecule interactions in
different cellular contexts appropriate to the study of such interactions.

BACKGROUND OF THE INVENTION

Many processes in biology, including transcription, translation,
and metabolic or signal transduction pathways, are mediated by
non-covalently-associated multienzyme complexes. The formation of
multiprotein or protein-nucleic acid complexes produces the most efficient
chemical machinery. Much of modern biological research is concerned with
identifying proteins involved in cellular processes, determining their
functions and how, when, and where they interact with other proteins
involved in specific pathways. Further, with rapid advances in genome
sequencing projects there is a need to develop strategies to define
"protein linkage maps", detailed inventories of protein interactions that
make up functional assemblies of proteins 23. Despite the importance of
understanding protein assembly in biological processes, there are few
convenient methods for studying protein-protein interactions in vivo.
Approaches include the use of chemical crosslinking reagents and resonance
energy transfer between dye-coupled proteins. A powerful and commonly
used strategy, the yeast two-hybrid system, is used to identify novel
protein-protein interactions and to examine the amino acid determinants of
specific protein interactions. The approach allows for rapid screening of a
large number of clones, including cDNA libraries. Limitations of this
technique include the fact that the interaction must occur in a specific
context (the nucleus of S. cerevisiae), and that it generally cannot be
used to distinguish induced versus constitutive interactions.

Recently, a novel strategy for detecting protein-protein
interactions has been demonstrated by Johnsson and Varshavsky 108, called the
ubiquitin-based split protein sensor (USPS) 9. The strategy is based on cleavage
of proteins with N-terminal fusions to ubiquitin by cytosolic proteases
(ubiquitinases) that recognize its tertiary structure. The strategy depends on the
reassembly of the tertiary structure of the protein ubiquitin from complementary
N- and C-terminal fragments and, crucially, on the augmentation of this
reassembly by oligomerization domains fused to these fragments. Reassembly
is detected as specific proteolysis of the assembled product by cytosolic
proteases (ubiquitinases). The authors demonstrated that a fusion of a reporter
protein-ubiquitin C-terminal fragment could also be cleaved by ubiquitinases, but
only if co-expressed with an N-terminal fragment of ubiquitin that was
complementary to the C-terminal fragment. The reconstitution of observable
ubiquitinase activity only occurred if the N- and C-terminal fragments were bound
through GCN4 leucine zippers 109. The authors suggested that this "split-gene"
strategy could be used as an in vivo assay of protein-protein interactions and
analysis of protein assembly kinetics in cells. Unfortunately, this strategy requires
additional cellular factors (in this case ubiquitinases) and the detection method
does not lend itself to high-throughput screening of cDNA libraries.

Rossi, F., C. A. Charlton, and H. M. Blau ((1997) Proc. Nat.
Acad. Sci. (USA) 94, 8405-8410) have reported an assay based on the classical
complementation of α and ω fragments of β-galactosidase (β-gal) and induction
of complementation by induced oligomerization of the proteins FKBP12 and the
mammalian target of rapamycin by rapamycin in transfected C2C12 myoblast cell
lines. Reconstitution of β-gal activity is detected using the substrate
fluorescein di-β-D-galactopyranoside in several fluorescence detection assays.
While this assay bears some resemblance to the present invention, there are
several significant distinguishing differences. First, this particular
complementation approach has been used for over thirty years in a vast number
of applications including the detection of protein-protein interactions.
Krevolin, M. and D. Kates ((1993) U.S. Patent No. 5,362,625) teach the use of
this complementation to detect protein-protein interactions. Also, achievement
of β-gal complementation in mammalian cells has previously been reported
(Moosmann, P. and S. Rusconi (1996) Nucl. Acids Res. 24, 1171-1172).

As in the USPS, the yeast two-hybrid strategy requires
additional cellular machinery for detection that exists only in specific cellular
compartments. There is therefore a need for a detection system which uses the
reconstitution of a specific enzyme activity from fragments as the assay itself,
without the requirement for other proteins for the detection of the activity.
Preferably, the assay would involve an oligomerization-assisted
complementation of fragments of monomeric or multimeric enzymes that require
no other proteins for the detection of their activity. Furthermore, if the structure
of an enzyme were known it would be possible to design fragments of the
enzyme to ensure that the reassembled fragments would be active and to
introduce mutations to alter the stringency of detection of reassembly. However,
knowledge of structure should not be a prerequisite to the design of
complementing fragments. The flexibility allowed in the design of such an
approach would make it applicable to situations where other detection systems
may not be suitable.

Recent advances in human genomics research have led to
rapid progress in the identification of novel genes. In applications to biological
and pharmaceutical research, there is now the pressing need to determine the
functions of novel gene products; for example, for genes shown to be involved
in disease phenotypes. It is in addressing questions of function where genomics-based
pharmaceutical research becomes bogged down. There is therefore the
need for advances in the development of simple and automatable functional
assays. A first step in defining the function of a novel gene is to determine its
interactions with other gene products in an appropriate context; that is, since
proteins make specific interactions with other proteins or other biopolymers as
part of functional assemblies, an appropriate way to examine the function of a
novel gene is to determine its physical relationships with the products of other
genes.

Screening techniques for protein interactions, such as the
yeast "two-hybrid" system, have transformed molecular biology, but can only be
used to study specific types of constitutively interacting proteins or interactions
of proteins with other molecules, in narrowly defined cellular and compartmental
contexts and require a complex cellular machinery (transcription) to work. To
rationally screen for protein interactions within the context of a specific problem
requires more flexible approaches. Specifically, assays that meet criteria
necessary not only to detecting molecular interactions, but also to validating
these interactions as specific and biologically relevant are required.

A list of assay characteristics that meet such criteria is as
follows:

1) Allow for the detection of protein-protein, protein-DNA/RNA or protein-drug
interactions in vivo or in vitro.

2) Allow for the detection of these interactions in appropriate contexts, such as
within a specific organism, cell type, cellular compartment, or organelle.

3) Allow for the detection of induced versus constitutive protein-protein
interactions (such as by a cell growth or inhibitory factor).

4) Allow for a distinction between specific and non-specific protein-protein
interactions by controlling the sensitivity of the assay.

5) Allow for the detection of the kinetics of protein assembly in cells.

6) Allow for screening of cDNA, small organic molecule, or DNA or RNA libraries
for molecular interactions.

The present description refers to a number of documents, the 
content of which is herein incorporated by reference. 

SUMMARY OF THE INVENTION 

The present invention seeks to meet the above-mentioned
needs, on which the prior art is silent. The present invention provides a general
strategy for detecting protein interactions with other biopolymers including other
proteins, nucleic acids, carbohydrates or for screening small molecule libraries
for compounds of potential therapeutic value. In a preferred embodiment, the
instant invention seeks to provide an oligomerization-assisted complementation
of fragments of monomeric enzymes that require no other proteins for the
detection of their activity. In one such embodiment, a protein-fragment
complementation assay (PCA) based on reconstitution of dihydrofolate
reductase activity by complementation of defined fragments of the enzyme in
E. coli is hereby provided. This assay requires no additional endogenous factors
for detecting specific protein-protein interactions (i.e. leucine zipper interactions)
and can be conveniently extended to screening cDNA, nucleic acid, small
molecule or protein design libraries for molecular interactions. In addition, the
assay can also be adapted to the detection of protein interactions in any cellular
context or compartment and be used to distinguish between induced versus
constitutive protein interactions in both prokaryotic and eukaryotic systems.

The individual PCAs presented here are completely de novo
designed interaction detection assays, not described in any way previously
except for publications arising from applicants' laboratory. Secondly, this
application describes a general strategy to develop molecular interaction assays
from a large number of enzyme or protein detectors, all de novo designed
assays, whereas the β-gal assay is not novel. Thirdly, there are no general
strategies or advancements over previously well documented applications given
in the art.

One particular strategy for designing a protein
complementation assay (PCA) is based on using the following characteristics:
1) A protein or enzyme that is relatively small and monomeric, 2) for which there
is a large literature of structural and functional information, 3) for which simple
assays exist for the reconstitution of the protein or activity of the enzyme, both
in vivo and in vitro, and 4) for which overexpression in eukaryotic and
prokaryotic cells has been demonstrated. If these criteria are met, the structure
of the enzyme is used to decide the best position in the polypeptide chain to split
the gene in two, based on the following criteria: 1) The fragments should result
in subdomains of continuous polypeptide; that is, the resulting fragments will not
disrupt the subdomain structure of the protein, 2) the catalytic and cofactor
binding sites should all be contained in one fragment, and 3) resulting new N-
and C-termini should be on the same face of the protein to avoid the need for
long peptide linkers and allow for studies of orientation-dependence of protein
binding.
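
The first two split-site criteria above can be illustrated with a short sketch. It is not the applicants' procedure: the function name `candidate_split_sites`, the subdomain boundaries and the residue numbers are all hypothetical, and criterion 3 (new termini on the same face of the protein) is left out because it would require three-dimensional coordinates.

```python
from typing import List, Tuple


def candidate_split_sites(
    subdomains: List[Tuple[int, int]],
    site_residues: List[int],
) -> List[int]:
    """Return split positions p (fragment 1 = residues 1..p) that
    (1) fall between consecutive subdomains, so no subdomain is cut, and
    (2) leave every catalytic/cofactor-binding residue in one fragment.
    """
    ordered = sorted(subdomains)
    sites = []
    for (_, end), (next_start, _) in zip(ordered, ordered[1:]):
        # Any position from the last residue of one subdomain up to the
        # residue before the next subdomain keeps both subdomains intact.
        for p in range(end, next_start):
            all_in_first = all(r <= p for r in site_residues)
            all_in_second = all(r > p for r in site_residues)
            if all_in_first or all_in_second:
                sites.append(p)
    return sites


# Hypothetical 180-residue enzyme: three subdomains joined by short linkers.
subdomains = [(1, 60), (64, 120), (123, 180)]
site_residues = [10, 35, 90, 110]  # hypothetical active-site residues
print(candidate_split_sites(subdomains, site_residues))  # [120, 121, 122]
```

In this made-up example only the second linker qualifies, because a cut in the first linker would separate the active-site residues into two fragments.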

It should be understood that the above mentioned criteria do
not all need to be satisfied for a proper working of the present invention. It is an
advantage that the enzyme be small, preferably between 10-40 kDa. Although
monomeric enzymes are preferred, multimeric enzymes can also be envisaged
as within the scope of the present invention. The dimeric protein tyrosinase can
be used in the instant assay. The information on the structure of the enzyme
provides an additional advantage in designing the PCA, but is not necessary.
Indeed, an additional strategy to develop PCAs is presented, based on a
combination of exonuclease digestion-generated protein fragments followed by
directed protein evolution in application to the enzyme aminoglycoside kinase.
Although the overexpression in prokaryotic cells is preferred it is not a necessity.
It will be understood by the skilled artisan that the enzyme catalytic site (of the
chosen enzyme) does not absolutely need to be on the same molecule.

The present application explains the rationale and criteria for
using a particular enzyme in a PCA. For PCA, a gene for a protein or enzyme
is rationally dissected into two or more fragments. Using molecular biology
techniques, the chosen fragments are subcloned, and to the 5' ends of each,
proteins that either are known or thought to interact are fused. Co-transfection
or transformation of these DNA constructs into cells is then carried out.
Reassembly of the probe protein or enzyme from its fragments is catalyzed by
the binding of the test proteins to each other, and reconstitution is observed with
some assay. It is crucial to understand that these assays will only work if the
fused, interacting proteins catalyze the reassembly of the enzyme. That is,
observation of reconstituted enzyme activity must be a measure of the interaction
of the fused proteins.

A preferred embodiment of the present invention focuses on
a PCA based on the enzyme dihydrofolate reductase. Expansion of the
strategy to include assays in eukaryotic cells, library screening, and a specific
application to problems concerning the study of integrated biochemical
pathways such as signal transduction pathways, is presented. Additional
assays, including those based on enzymes that can act as dominant or recessive
drug selection or metabolic salvage pathways are disclosed. In addition, PCAs
based on enzymes that will produce a colored or fluorescent product are also
disclosed. The present invention teaches how the PCA strategy can be both
generalized and automated for functional testing of novel genes, screening of
natural products or compound libraries for pharmacological activity and
identification of novel gene products that interact with DNA, RNA or
carbohydrates. It also teaches how the PCA strategy can be applied to
identifying natural products or small molecules from compound libraries of
potential therapeutic value that can inhibit or activate such molecular interactions
and how enzyme substrates and small molecule inhibitors of enzymes can be
identified. Finally, it teaches how the PCA strategy can be used to perform
protein engineering experiments that could lead to designed enzymes with
industrial applications or peptides with biological activity.

Simple strategies to design and implement assays for
detecting protein interactions in vivo are disclosed herein. Complementary
fragments of the native mDHFR have been designed such that, when
coexpressed in E. coli grown in minimal medium, they allow for survival of clones
expressing the two fragments, where the basal activity of the endogenous
bacterial DHFR is inhibited by the competitive inhibitor trimethoprim.
Reconstitution of activity only occurred when both N- and C-terminal fragments
of DHFR were coexpressed as C-terminal fusions to GCN4 leucine zipper
sequences, indicating that reassembly of the fragments requires formation of a
leucine zipper between the N- and C-terminal fusion peptides. The sequential
increase in cell doubling times resulting from the destabilizing mutations
directed at the assembly interface (Ile114 to Val, Ala or Gly) demonstrates that
the observed cell survival under selective conditions is a result of the specific,
leucine-zipper-assisted association of mDHFR fragment[1,2] with fragment[3],
as opposed to nonspecific interactions of Z-F[3] with Z-F[1,2]. Several detailed
and many additional examples are given.
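
The qualitative outcome of these controls amounts to a logical AND over the co-expressed constructs. The sketch below restates it with hypothetical names (`Construct`, `colonies_expected`); it encodes only the result reported in the text (colonies under trimethoprim selection when both complementary fragments are present and both carry a leucine zipper) and is not a model of growth or of the interface mutants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    fragment: str     # "F[1,2]" or "F[3]" of mDHFR
    has_zipper: bool  # fused to a GCN4 leucine zipper sequence?


def colonies_expected(a: Construct, b: Construct) -> bool:
    """True only if the two fragments are complementary and both are
    fused to leucine-zipper-forming sequences."""
    complementary = {a.fragment, b.fragment} == {"F[1,2]", "F[3]"}
    return complementary and a.has_zipper and b.has_zipper


z_f12 = Construct("F[1,2]", True)
z_f3 = Construct("F[3]", True)
f3_no_zipper = Construct("F[3]", False)

print(colonies_expected(z_f12, z_f3))          # True
print(colonies_expected(z_f12, f3_no_zipper))  # False
print(colonies_expected(z_f12, z_f12))         # False
```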

As demonstrated previously with the ubiquitin-based split
protein sensor (USPS) 9, a protein-fragment complementation strategy can be
used to study equilibrium and kinetic aspects of protein-protein interactions in
vivo. The DHFR and other PCAs, however, are simpler assays. They are
complete systems; no additional endogenous factors are necessary and the
results of complementation are observed directly, with no further manipulation.
The E. coli cell survival assay described herein should therefore be particularly
useful for screening cDNA libraries for protein-protein interactions. mDHFR
expression in cells can be monitored by binding of fluorescent high-affinity
substrate analogues for DHFR 26.

There are several further aspects of the PCAs that distinguish
them from all other known strategies for studying protein-protein interactions in
vivo (except USPS). Complementary fragments of enzymes that allow for
controlling the stringency of the assay have been designed, and could be used
to obtain estimates of the kinetics and equilibrium constants for association of
two proteins. For example, with DHFR the point mutations of the wild-type
enzyme Ile114 to Val, Ala, or Gly alter the stringency of reconstitution of DHFR
activity. For determining estimates of equilibrium and kinetic parameters for a
specific protein-protein interaction, one could perform a series of DHFR PCA
experiments with two proteins that interact with a known affinity, using the wild
type or destabilizing mutant DHFR fragments. Comparison of cell growth rates
in this model system with rates for a DHFR PCA using unknowns would give an
estimate of the strength of the unknown interaction.
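
As a rough illustration of the growth-rate comparison described above, the following sketch estimates doubling times from optical-density readings by a log-linear least-squares fit. The readings are synthetic, and the function name and the use of optical density as the growth measure are assumptions made for the example; the text itself only states that growth rates are compared.

```python
import math
from typing import Sequence


def doubling_time(times_h: Sequence[float], od: Sequence[float]) -> float:
    """Doubling time (hours) from exponential-phase readings.

    Fits ln(OD) = ln(OD0) + k * t by least squares and returns ln(2) / k.
    """
    n = len(times_h)
    logs = [math.log(v) for v in od]
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    k = sxy / sxx  # specific growth rate, per hour
    return math.log(2) / k


# Synthetic readings: a reference pair doubling every 2 h, an unknown every 3 h.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
reference = [0.05 * 2 ** (t / 2.0) for t in times]
unknown = [0.05 * 2 ** (t / 3.0) for t in times]

print(round(doubling_time(times, reference), 2))  # 2.0
print(round(doubling_time(times, unknown), 2))    # 3.0
```

Under the premise of the assay, a longer doubling time than the reference pair would suggest a weaker interaction; turning that into an affinity estimate would require calibration against several pairs of known affinity.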

It should be understood that the present invention should not
be limited to the DHFR or other PCAs presented herein, as they serve only as
non-limiting embodiments of the protein complementation assay of the present
invention. Moreover, the PCAs should not be limited in the context in which they
could be used. Constructs could be designed for targeting the PCA fusions to
specific compartments in the cell by addition of signaling peptide sequences 27-28.
Induced versus constitutive protein-protein interactions could be distinguished
by a eukaryotic version of the PCA, in the case of an interaction that is triggered
by a biochemical event. Also, the system could be adapted for use in screening
for novel, induced protein-molecular associations between a target protein and
an expression library.

The instant invention is also directed to a method for detecting
biomolecular interactions, the method comprising:

(a) selecting an appropriate reporter molecule;

(b) effecting fragmentation of the reporter molecule such that the fragmentation
results in reversible loss of reporter function;

(c) fusing or attaching fragments of the reporter molecule separately to other
molecules; followed by

(d) reassociation of the reporter fragments through interactions of the molecules
that are fused to the fragments.

The invention also provides molecular fragment
complementation assays for the detection of molecular interactions comprising
a reassembly of separate fragments of a molecule, wherein reassembly of the
fragments is operated by the interaction of molecular domains fused to each
fragment of the molecules, and wherein reassembly of the fragments is
independent of other molecular processes.

In another aspect, the present invention is directed to a
method of testing biomolecular interactions comprising:

a) generating a first fusion product comprising

i) a first fragment of a first molecule; and

ii) a second molecule which is different from or the same as the first molecule;

b) generating a second fusion product comprising

i) a second fragment of the first molecule; and

ii) a third molecule which is different from or the same as the first molecule
or second molecule;

c) allowing the first and second fusion products to contact each other; and

d) testing for activity regained by association of the recombined fragments of the
first molecule, wherein the reassociation is mediated by interaction of the
second and third molecules.

In another novel feature, the invention is directed to a method
comprising an assay where fragments of a first molecule are fused to a second
molecule and fragment association is detected by reconstitution of the first
molecule's activity.

The present invention also provides a composition comprising 
a product selected from the group consisting of: 

(a) a first fusion product comprising: 

i) a first fragment of a first molecule whose fragments can exhibit a
detectable activity when associated and

ii) a second molecule that can bind (a)(i);

(b) a second fusion product comprising

i) a second fragment of the first molecule and

ii) a third molecule that can bind (b)(i); and

(c) both (a) and (b).

The invention further provides a composition comprising
complementary fragments of a first molecule, each fused to a separate fragment
of a second molecule.

The inventors of the present subject matter further provide a
composition comprising a nucleic acid molecule coding for a fusion product,
which molecule comprises sequences coding for a product selected from the
group consisting of:

(a) a first fusion product comprising:

i) fragments of a first molecule whose fragments can exhibit a detectable
activity when associated and

ii) a second molecule fused to the fragment of the first molecule;

(b) a second fusion product comprising

i) a second fragment of the first molecule and

ii) a second or third molecule; and

(c) both (a) and (b).

The present invention is also directed to a method of testing
for biomolecular interactions associated with: (a) complementary fragments of
a first molecule whose fragments can exhibit a detectable activity when
associated or (b) binding of two protein-protein interacting domains from a
second or third molecule, the method comprising:

1) creating a fusion of

(a) a first fragment of a first molecule whose fragments can exhibit a
detectable activity when associated and

(b) a first protein-protein interacting domain;

2) creating a fusion of

(a) a second fragment of the first molecule and

(b) a second protein-protein interacting domain that can bind the first
protein-protein interacting domain;

3) allowing the fusions of (1) and (2) to contact each other; and

4) testing for the activity.

The instant invention further provides a composition
comprising a product selected from the group consisting of:

(a) a first fusion product comprising:

i) a first fragment of a molecule whose fragments can exhibit a detectable
activity when associated and

ii) a first protein-protein interacting domain;

(b) a second fusion product comprising

i) a second fragment of the first molecule and

ii) a second protein-protein interacting domain that can bind the first
protein-protein interacting domain; and

(c) both (a) and (b).

The invention is also directed to a composition comprising a 
nucleic acid molecule coding for a fusion product, which molecule comprises 
sequences coding for either: 

(a) a first fusion product comprising: 

i) a first fragment of a molecule whose fragments can exhibit a detectable 
activity when associated and 

ii) a first protein-protein interacting domain; or 

(b) a second fusion product comprising 

i) a second fragment of the molecule and 

ii) a second protein-protein interacting domain that can bind the first 
protein-protein interacting domain; or 

(c) both (a) and (b). 

The invention also provides a method of detecting kinetics of 
protein assembly and screening cDNA libraries comprising performing PCA. 

In another embodiment, the invention further provides a 
method of testing the ability of a compound to inhibit molecular interactions in a 
PCA comprising performing a PCA in the presence of the compound and 
correlating any inhibition with the presence of the compound.

In a further embodiment, the invention provides a method for
detecting protein-protein interactions in living organisms and/or cells, which
method comprises:

(a) synthesizing probe protein fragments from an enzyme which enables
dominant selection by dissecting the gene coding for the enzyme into at least two
fragments;

(b) constructing fusion proteins with one or more molecules that are to be tested
for interactions;

(c) fusing the proteins obtained in (b) with one or more of the probe fragments;

(d) coexpressing the fusion proteins; and

(e) detecting the reconstitution of enzyme activity.

The invention still provides a method for detecting
biomolecular interactions, the method comprising:

(a) selecting an appropriate reporter molecule;

(b) effecting fragmentation of the reporter molecule;

(c) fusing or attaching fragments of the reporter molecule separately to other
molecules; followed by

(d) reassociation of the reporter fragments through interactions of the molecules
that are fused to the fragments.

The invention further relates to a method employing a Protein
Complementation Assay/Universal Reporter System (PCA/URS) for detecting
and screening for agonists and antagonists of a membrane receptor, which
method comprises:

a) generating a first nucleic acid vector encoding a first fusion product
comprising:

i) a first fragment of a first PCA/URS reporter molecule, and

ii) a second molecule, fused to the first fragment, which comprises a first
subdomain of a cellular receptor molecule of interest;

b) generating a second nucleic acid vector encoding a second fusion product
comprising:

i) a second fragment of the first PCA/URS reporter molecule, and

ii) a third molecule, fused to the second fragment, which comprises a
second subdomain of the cellular receptor, and where the second
subdomain may be the same as the first subdomain in the case of a
homodimeric cellular receptor, or different from the first subdomain in the
case of a heterodimeric cellular receptor;

c) transfecting prokaryotic or eukaryotic cells with the first and second nucleic
acid vectors;

d) testing the transfected cells for the PCA/URS reporter activity, the activity
indicating reassociation of the first and second fragments of the PCA/URS
molecule mediated by the interaction of the first and second subdomains of the
cellular receptor molecule.

In a further embodiment, the invention is directed to a method
employing a Protein Complementation Assay/Universal Reporter System
(PCA/URS) for detecting and screening for agonists and antagonists of a
membrane receptor which method comprises:

a) generating a first nucleic acid vector encoding a first fusion
product comprising:

i) a first fragment of a first PCA/URS reporter molecule, and

ii) a second molecule, fused to the first fragment, which comprises a first
subdomain of a cellular receptor molecule of interest;

b) generating a second nucleic acid vector encoding a second fusion product
comprising:

i) a second fragment of the first PCA/URS reporter molecule, and

ii) a third molecule, fused to the second fragment, which comprises a
second subdomain of the cellular receptor, and where the second
subdomain may be the same as the first subdomain in the case of a
homodimeric cellular receptor, or different from the first subdomain in the
case of a heterodimeric cellular receptor;

c) transfecting prokaryotic or eukaryotic cells with the first and second nucleic
acid vectors;

d) obtaining a clonal population of cells that express the first and second fusion
products; and

e) testing the transfected cells for the PCA/URS reporter activity, the activity
indicating reassociation of the first and second fragments of the PCA/URS
molecule mediated by the interaction of the first and second subdomains of the
cellular receptor molecule.

In another embodiment, the invention relates to a method
employing a Protein Complementation Assay/Universal Reporter System
(PCA/URS) for detecting and screening for agonists and antagonists of a
membrane receptor, which method comprises:

a) generating a first nucleic acid vector encoding a first fusion product
comprising:

i) a first fragment of a first PCA/URS reporter molecule,

ii) a first linker, fused at one end to the first fragment, the linker region
comprising between 1 and 30 amino acid residues; and

iii) a second molecule, fused to the other end of the first linker, which
comprises a first subdomain of a cellular receptor molecule of interest;

b) generating a second nucleic acid vector encoding a second fusion product
comprising:

i) a second fragment of the first PCA/URS reporter molecule,

ii) a second linker, fused at one end to the second fragment, the linker
comprising between 1 and 30 amino acid residues; and

iii) a third molecule, fused to the other end of the second linker, which
comprises a second subdomain of the cellular receptor, and where the
second subdomain may be the same as the first subdomain in the case of
a homodimeric cellular receptor, or different from the first subdomain in the
case of a heterodimeric cellular receptor;

c) transfecting prokaryotic or eukaryotic cells with the first and second nucleic
acid vectors;

d) testing the transfected cells for the PCA/URS reporter activity, the activity
indicating reassociation of the first and second fragments of the PCA/URS
molecule mediated by the interaction of the first and second subdomains of the
cellular receptor molecule.
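
By way of illustration only, the two fusion products recited in the embodiment
above can be sketched as a small data structure. This is a minimal sketch and not
part of the claimed method: the class name, field names and the check for a
complementary pair are hypothetical, and only the 1 to 30 residue limit on the
linker is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProduct:
    """One half of a PCA pair: reporter fragment + linker + receptor subdomain.

    All field names are hypothetical labels chosen for this sketch.
    """
    reporter_fragment: str   # e.g. "F[1,2]" or "F[3]"
    linker_length: int       # number of amino acid residues in the linker
    receptor_subdomain: str  # label for the fused receptor subdomain

    def __post_init__(self):
        # The embodiment specifies a linker of between 1 and 30 residues.
        if not 1 <= self.linker_length <= 30:
            raise ValueError("linker must comprise between 1 and 30 residues")


def is_complementary_pair(first, second):
    """True when the two fusions carry different fragments of the reporter."""
    return first.reporter_fragment != second.reporter_fragment


# Same subdomain on both fusions corresponds to the homodimeric receptor case.
first = FusionProduct("F[1,2]", 5, "EpoR")
second = FusionProduct("F[3]", 5, "EpoR")
print(is_complementary_pair(first, second))
```

A heterodimeric receptor would simply use two different subdomain labels; the
fragment check is unchanged.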

Lastly, the invention also provides a novel method of effecting
gene therapy, which includes the step of providing the assays and compositions
described above.

The present invention is pioneering as it is the first protein
complementation assay displaying such a level of simplicity and versatility. The
exemplified embodiments are protein-fragment complementation assays (PCA)
based on mDHFR, where a leucine zipper directs the reconstitution of DHFR
activity. Activity was detected by an E. coli survival assay which is both practical
and inexpensive. This system illustrates the use of mDHFR fragment
complementation in the detection of leucine zipper dimerization and could be
applied to the detection of unknown, specific protein-molecular interactions in
vivo.

It should be understood that the instant invention is not
limited to the PCAs presented here, as numerous other enzymes can be
selected and used in accordance with the teachings of the present invention.
Examples of such markers can be found in Kaufman, (1987 Genetic Eng. 9:155-
198) and references found therein as well as table 1 of this application.

It should also be clear to the skilled artisan to which the
present invention pertains that the invention is not limited to the use of leucine
zippers as the two interacting molecules. Indeed, numerous other types of
protein-molecule interactions can be used and identified in accordance with the
teaching of the present invention. The types of motifs involved in
protein-molecular interactions are well known in the art.

The present application refers to numerous prior art
documents and the entire contents of all those prior art documents are herein
incorporated by reference in their entirety.

Other features and advantages of the present invention will
be apparent from the following description of the preferred embodiments thereof,
the appended Examples and from the enjoined claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Having thus generally described the invention, reference will
now be made to the accompanying drawings, showing by way of illustration a
preferred embodiment thereof, and in which:

FIG. 1 provides a general description of a PCA. Using
molecular biology techniques, the chosen fragments of the enzyme are
subcloned, and to the 5' ends of each, proteins that either are known or thought
to interact are fused. Co-transfection or transformation of these DNA constructs
into cells is then carried out and reconstitution with some assay is observed.

FIG. 2 is a scheme of the fusion constructs used in one of the
embodiments of the invention. The hexahistidine peptide (6His), the homo-
dimerizing GCN4 leucine zipper (Zipper) and mDHFR fragments (1, 2 and 3) are
illustrated. The labels for the constructs are used to identify both the DNA
constructs and the proteins expressed from these constructs.

FIG. 3: (A) shows E. coli survival assay on minimal medium
plates. Control: Left side of the plate: E. coli harboring pQE-30 (no insert); right
side: E. coli harboring pQE-16, coding for native mDHFR. Panel I: Left side of
each plate: transformation with construct Z-F[1,2]; right side of each plate:
transformation with construct Z-F[3]. Panel II: Cotransformation with constructs
Z-F[1,2] and Z-F[3]. Panel III: Cotransformation with constructs Control-F[1,2]
and Z-F[3]. All plates contain 0.5 mg/ml trimethoprim. In panels I to III, plates on
the right side contain 1 mM IPTG.

(B) E. coli survival assay using destabilizing DHFR mutants.
Panel I: Cotransformation of E. coli with constructs Z-F[1,2] and Z-F[3:Ile114Val].
Panel II: Cotransformation with Z-F[1,2] and Z-F[3:Ile114Ala]. Inset is a 5-fold
enlargement of the right-side plate. Panel III: Cotransformation with Z-F[1,2] and
Z-F[3:Ile114Gly]. All plates contain 0.5 mg/ml trimethoprim. Plates on the right
side contain 1 mM IPTG.

FIG. 4 features the coexpression of mDHFR fragments. (A)
Agarose gel analysis of restriction pattern resulting from Hindi digestion of
plasmid DNA. Lane 1 contains DNA isolated from E. coli cotransformed with
constructs Z-F[1,2] and Z-F[3]. Lanes 2 and 3 contain DNA isolated from E. coli
transformed with, respectively, construct Z-F[3] and construct Z-F[1,2].
Fragment migration (in bp) is indicated to the right.

(B) SDS-PAGE analysis of mDHFR fragment expression.
Lanes 1 to 5 show crude lysate of untransformed E. coli (lane 1), or E. coli
expressing Z-F[1,2] (20.8 kDa; lane 2), Z-F[3] (18.4 kDa; lane 3), Control-F[1,2]
(14.2 kDa; lane 4), and Z-F[1,2] + Z-F[3] (lane 5). Lane 6 shows 40 ml out of 2 ml
copurified Z-F[1,2] and Z-F[3]. Arrowheads point to the proteins of interest.
Migration of molecular weight markers (in kDa) is indicated to the right.

FIG. 5 illustrates the general features of a PCA based on a
survival assay such as the DHFR PCA. The assay can be used in a bacterial
or a mammalian context. The inserted target DNA can be a known sequence
coding for a protein (or protein domain) of interest, or can be a cDNA library.

FIG. 6 represents an autoradiograph of a COS cell lysate after
a 30 min. 35S-Met-Cys pulse-labelling. The expression pattern is essentially
identical to that observed in E. coli (see Fig. 4). The DNA transfected into the
cells (or cotransfected) is indicated above the respective lanes.

FIG. 7 illustrates the results of a protein engineering
application of the mDHFR bacterial PCA. Two semi-random leucine zipper
libraries were created (as described in the text) and each inserted N-terminal to
one of the mDHFR fragments. Cotransformation of the resulting zipper-DHFR
fragment libraries in E. coli and plating on selective medium allowed for survival
of clones harboring successfully interacting leucine zippers. Fourteen clones
were isolated and the zippers were sequenced to identify the residues at the "e"
and "g" positions. The "e-g" pairs were categorized as having attractive pairing
(charge:charge, charge:neutral polar or neutral polar:neutral polar) or repulsive
pairing (charge:charge) and the number of each type of interaction scored for
each clone. The total number of interactions for each clone is 6; the interactions
are tallied on the histogram.
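
The scoring of "e-g" pairs described for FIG. 7 can be sketched as follows. This
is an illustrative sketch only: the residue sets, the function names and the six
example pairs are assumptions made here and are not data from the fourteen
clones. Opposite charges are treated as the attractive charge:charge case and
like charges as the repulsive charge:charge case.

```python
# Assumed residue classes for illustration; the text does not list them.
CHARGE = {"E": -1, "D": -1, "K": +1, "R": +1}  # charged side chains
NEUTRAL_POLAR = {"Q", "N", "S", "T"}           # neutral polar side chains


def classify_pair(e_residue, g_residue):
    """Classify one e-g pair as attractive, repulsive or unclassified."""
    ce, cg = CHARGE.get(e_residue), CHARGE.get(g_residue)
    if ce is not None and cg is not None:
        # charge:charge is attractive for opposite signs, repulsive for like signs
        return "attractive" if ce != cg else "repulsive"
    e_ok = ce is not None or e_residue in NEUTRAL_POLAR
    g_ok = cg is not None or g_residue in NEUTRAL_POLAR
    if e_ok and g_ok:
        # charge:neutral polar or neutral polar:neutral polar
        return "attractive"
    return "unclassified"


def tally(pairs):
    """Count each type of interaction for one clone."""
    counts = {"attractive": 0, "repulsive": 0, "unclassified": 0}
    for e_residue, g_residue in pairs:
        counts[classify_pair(e_residue, g_residue)] += 1
    return counts


# Hypothetical clone with six e-g pairs (six interactions per clone, as in FIG. 7).
example_clone = [("E", "K"), ("E", "E"), ("Q", "K"),
                 ("Q", "Q"), ("K", "E"), ("R", "R")]
print(tally(example_clone))
```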

Fig. 8 is a schematic representation of the structure-based Epo
receptor activation hypothesis (upper panel) and of the experimental strategy
to test it (lower panel). (Upper panel) Receptors are constitutive dimers in their
unligated state. The extracellular domain exists in a conformation that holds the
intracellular domains and associated JAK2 separated from each other by
approximately 80 Å. On binding ligand (Epo or peptide agonist EMP1) the
extracellular dimer is reorganized, bringing the intracellular domains to within 30
Å of each other, allowing autophosphorylation and activation of the JAK2s. (ii)
The extracellular and transmembrane domains of murine EpoR are fused to one
of two complementary fragments of murine DHFR (F[1,2] or F[3]) via flexible
linkers (gray lines) consisting of (Gly.Gly.Gly.Gly.Ser)N repeats where N = 1, 2 or
6 to generate the following: EpoR-5aa-F[1,2] or -F[3], EpoR-10aa-F[1,2] or -F[3],
EpoR-30aa-F[1,2] or -F[3] (see the legend of figure 9). Cells transfected with
these fusions express receptors at the membrane surface. Fluorescein-
methotrexate (fMTX) is taken up by cells and binds to reconstituted DHFR
(F[1,2]+F[3]) and is retained in the cell. Unbound fMTX is rapidly released from
the cells by active transport. Fusions in which DHFR fragments are connected
to receptors by 5 or 10 amino acid linkers cannot complement, or complement
only weakly, in the inactive receptor (minimum separations of 40 or 80 Å,
respectively). When receptors bind Epo or EMP1, DHFR complementation can
take place (separation 34 Å). (iii) Fusions with the 30aa linker allow
complementation of DHFR fragments whether receptors are ligand bound or not.
(iv) Results of complementation experiments with EpoR extracellular and
transmembrane domain should be reproducible with complete EpoR receptor
complex, including associated JAK2. Here is shown one such experiment in
which DHFR fragments are fused to the C-terminal of JAK2 and co-expressed in
cells along with full length EpoR.

Fig. 9 shows the results of fluorescence microscopy of CHO
DUKX-B11 cells expressing EpoR extracellular and transmembrane domains
fused to DHFR complementary fragments (constructs described in Fig. 8) and
exposed to fMTX in the presence or absence of Epo or EMP1. All fusion clones
were generated by PCR amplification of individual genes of interest. The
construction of the DHFR F[1,2] and F[3] has been previously described.
Oligonucleotides coding for flexible linker peptides were synthesized individually
with 5' and 3' complementary overhangs corresponding to 5' or 3' insertion
between EpoR and DHFR fragment encoding sequences. Regions of each
construct were subcloned into the mammalian expression vector pMT3. Cells
were stably transfected, using lipofectamine (Life Technologies/GibcoBRL), with
EpoR-DHFR fragment and stable colonies selected on alpha-MEM enriched with
dialyzed 10 % fetal bovine serum (dialyzed to remove nucleotides, rendering
cells dependent on exogenous DHFR activity) and in the presence of 2 nM
human recombinant Epo (R. W. Johnson Pharmaceutical Research Institute).
For microscopy, cells were grown on 18 mm glass cover slips to approximately
1 x 10^5 in 12 well plates. fMTX (Molecular Probes) was added to each sample
at a final concentration of 10 µM and incubated for 22 hours at 37°C. Prior to
microscopy, cells were treated with 10 nM Epo or 10 µM EMP1 for 30 minutes at
37°C. The medium was removed and the cells were washed with PBS
(phosphate-buffered saline) extensively and reincubated for 15 minutes in alpha-
MEM and Epo or EMP1 to allow for efflux of unbound fMTX. Medium was
removed and cells were washed 4 times with PBS on ice and finally mounted on
glass slides. Fluorescent microscopy was performed on live cells with a Zeiss
Axiovert 10 inverted microscope (objective lens Zeiss Plan Neofluor 10/0.75).

Fig. 10 (A) illustrates the fluorescent flow cytometric analysis
of Epo or EMP1 induced response in CHO DUKX-B11 cells expressing EpoR-
DHFR fragment fusions and labeled with fMTX. Upper panel: cells
transfected with EpoR-5aa-F[1,2] and -F[3]; middle panel: with EpoR-10aa-F[1,2]
and -F[3]; lower panel: with EpoR-30aa-F[1,2] and -F[3]. Histograms are
based on analysis of fluorescence intensity for 10,000 cells at flow rates of
approximately 1000 cells per second. Data were collected on a Coulter XL 4
color FACS analyzer (Coulter-Beckman) with stimulation with an argon laser
tuned to 488 nm and emission recorded through a 525 nm band pass filter.
Histograms represent response in absence of ligands (black trace), with 10 nM
Epo (dark gray trace) or 10 µM EMP1 (light gray trace). Preparation of cells for
analysis was the same as described for microscopy (see Fig. 9), except that
following the PBS wash, cells were gently trypsinized, suspended in 500 µL of cold
PBS supplemented with 10% FBS in order to increase cell viability and kept on
ice prior to cytometric analysis within 20 minutes.

Fig. 10 (B) shows the dose-response curves for Epo and EMP1
based on flow cytometric analysis of CHO DUKX-B11 cells expressing EpoR-
5aa-F[1,2] and -F[3] as in (A), upper panel. Mean fluorescence intensities were
determined for three separate samples at each ligand concentration (between
0.0003 nM and 100 nM for Epo (upper panel) or between 0.0003 µM and 100 µM
for EMP1 (lower panel)). X-axis is the mean fluorescence intensity relative to the
maximum intensity observed and renormalized to zero for the minimum
response. Traces through data points represent a non-linear least-squares fit of
results to a Langmuir isotherm determined in the computer program MacCurveFit
(Kevin Raner Software) with a Quasi-Newton optimization routine (r^2 and residual
error for Epo curve were 0.98 and 0.045, respectively, and for EMP1 curve 0.99
and 0.022).
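
The normalization and curve fit described for Fig. 10 (B) can be sketched as
below. This is a minimal sketch under stated assumptions: a coarse grid search
over the dissociation constant stands in for the Quasi-Newton routine of
MacCurveFit, the function names are hypothetical, and the responses are
synthetic values generated from an arbitrary dissociation constant rather than
measured fluorescence.

```python
import math


def langmuir(conc, kd):
    """Fractional response of a single-site Langmuir isotherm."""
    return conc / (kd + conc)


def normalize(values):
    """Rescale so the minimum response is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def fit_kd(concs, responses, log10_lo=-4.0, log10_hi=3.0, steps=701):
    """Least-squares estimate of kd by grid search on a log10 scale.

    The grid search is an assumption of this sketch, not the original method.
    """
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = 10 ** (log10_lo + (log10_hi - log10_lo) * i / (steps - 1))
        sse = sum((langmuir(c, kd) - r) ** 2 for c, r in zip(concs, responses))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd, best_sse


# Synthetic example spanning the 0.0003 to 100 nM range used for Epo.
concs = [0.0003, 0.003, 0.03, 0.3, 3.0, 30.0, 100.0]
true_kd = 0.5  # hypothetical value in nM, chosen only to generate test data
responses = [langmuir(c, true_kd) for c in concs]
kd_fit, sse = fit_kd(concs, responses)
print(round(kd_fit, 2))
```

With real data, the measured mean intensities would first be passed through
`normalize` before fitting.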

Fig. 11 is directed to fluorescence microscopy of COS-7 cells.
Plasmids pMT3 harboring full length EpoR or EpoR or JAK2 fused via 5aa to
F[1,2] or F[3] were created and COS-7 cells were transiently transfected or
cotransfected with the different clones, and treated and analyzed as in figure 9.



SUBSTITUTE SHEET (RULE 26) 



WO 00/07038 



PCT/CA99/00702 



20 

Other objects, advantages and features of the present
invention will become more apparent upon reading of the following non-restrictive
description of preferred embodiments with reference to the accompanying
drawings which are exemplary and should not be interpreted as limiting the
scope of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT 
Selection of mDHFR for a PCA 

In designing a protein-fragment complementation assay
(PCA), it was sought to identify an enzyme 1) that is relatively small and
monomeric, 2) for which structural and functional information exists, 3) for which
simple assays exist for both in vivo and in vitro measurement, and 4) for which
overexpression in eukaryotic and prokaryotic cells had been demonstrated.
Murine DHFR (mDHFR) meets all of the criteria for a PCA listed above. DHFR is
central to cellular one-carbon metabolism and is absolutely required for cell
survival in both prokaryotes and eukaryotes. Specifically it catalyses the
reduction of dihydrofolate to tetrahydrofolate for use in transfer of one-carbon
units required for biosynthesis of serine, methionine, purines and thymidylate.
The DHFRs are small (17 kD to 21 kD), monomeric proteins. The crystal
structures of DHFR from various bacterial and eukaryotic sources are known and
substrate binding sites and active site residues have been determined 111-114,
allowing for rational design of protein fragments. The folding, catalysis, and
kinetics of a number of DHFRs have been studied extensively 115-119. The
enzyme activity can be monitored in vitro by a simple spectrophotometric
assay 120, or in vivo by cell survival in cells grown in the absence of DHFR end
products. DHFR is specifically inhibited by the anti-folate drug trimethoprim. As
mammalian DHFR has a 12000-fold lower affinity for trimethoprim than does
bacterial DHFR 21, growth of bacteria expressing mDHFR in the presence of
trimethoprim levels lethal to bacteria is an efficient means of selecting for
reassembly of mDHFR fragments into active enzyme. High level expression of
mDHFR has been demonstrated in transformed prokaryote or transfected
eukaryotic cells 22-128.
Design Considerations

mDHFR shares high sequence identity with the human DHFR
(hDHFR) sequence (91% identity) and is highly homologous to the E. coli
enzyme (29% identity, 68% homology) and these sequences share visually
superimposable tertiary structure 111. Comparison of the crystal structures of
mDHFR and hDHFR suggests that their active sites are essentially identical 127,128.
DHFR has been described as being formed of three structural fragments forming
two domains 129-130: the adenine binding domain (residues 47 to 105 = fragment[2])
and a discontinuous domain (residues 1 to 46 = fragment[1] and 106 to 186 =
fragment[3]; numbering according to the murine sequence). The folate binding
pocket and the NADPH binding groove are formed mainly by residues belonging
to fragments [1] and [2]. Fragment [3] is not directly implicated in catalysis.
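
The fragment boundaries given above (murine numbering) can be written down as
a small lookup, which is convenient when asking to which fragment or domain a
given residue, such as Ile114, belongs. This is only an illustrative sketch; the
names `FRAGMENTS`, `fragment_of` and `domain_of` are hypothetical.

```python
# Residue ranges (inclusive, murine numbering) as stated in the text.
FRAGMENTS = {
    "F[1]": (1, 46),
    "F[2]": (47, 105),
    "F[3]": (106, 186),
}


def fragment_of(residue):
    """Return the structural fragment containing the given residue number."""
    for name, (start, end) in FRAGMENTS.items():
        if start <= residue <= end:
            return name
    raise ValueError("residue outside the 1 to 186 range")


def domain_of(residue):
    """F[2] is the adenine binding domain; F[1] and F[3] form the other domain."""
    if fragment_of(residue) == "F[2]":
        return "adenine binding domain"
    return "discontinuous domain"


# The three fragments together cover all 186 residues without overlap.
total_length = sum(end - start + 1 for start, end in FRAGMENTS.values())
print(fragment_of(114), domain_of(114), total_length)
```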

Residues 101 to 108 of hDHFR, at the junction between
fragment[2] and fragment[3], form a disordered loop which lies on the same face
of the protein as both termini. It was chosen to cleave mDHFR between
fragments [1,2] and [3], at residue 107, so as to cause minimal disruption of the
active site and NADPH cofactor binding sites. The native N-terminus of mDHFR
and the novel N-terminus created by cleavage occur on the same surface of the
enzyme 112,128, allowing for ease of N-terminal covalent attachment of each
fragment to associating fragments such as the leucine zippers used in this study.
Using this system, a leucine-zipper assisted assembly of the mDHFR fragments
into active enzyme was obtained.

The present invention further illustrates that signaling by the
Erythropoietin Receptor is mediated by a ligand-induced conformation change
in constitutive receptor dimers. Erythropoietin and other cytokine receptors are
thought to be activated through hormone-induced dimerization and
autophosphorylation of JAK kinases associated with the receptor intracellular
domains. Using an in vivo protein fragment complementation assay based on
murine dihydrofolate reductase association with a fluorescent probe, applicants
have discovered that constitutive erythropoietin receptor dimers exist in a
conformation that prevents association of JAK2 but undergo a ligand-induced
conformation change that allows JAK2 to self-associate. These results are
consistent with crystallographic evidence for the conformations of native and
ligand-bound forms of the Erythropoietin receptor.

It is also known that Erythropoietin (Epo) regulates
proliferation and differentiation of erythroid progenitors. Many disorders of
erythroid proliferation are caused by genetic disorders of erythropoietin
biosynthesis or by genetic disruption of Epo synthesis or Epo receptor-mediated
signal transduction (Foa, P., Acta Haematologica 86, 162-8 (1991); Watowich,
et al., Annual Review of Cell & Developmental Biology 12, 91-128 (1996);
Spivak, J.L., Transactions of the American Clinical & Climatological Association
102, 232-42 (1990) and Lodish et al., Cold Spring Harbor Symposia on
Quantitative Biology 60, 93-104 (1995)). Such disorders include anemias due to
renal failure, cancer chemotherapy and AZT treatment (Krantz, S.B., Blood 77,
419-34 (1991)). The Epo receptor (EpoR) shares both structural and functional
features with the cytokine receptor superfamily that includes the interleukins,
human growth hormone (hGH) and colony stimulating factor CSF (D'Andrea et
al., Cell 58, 1023-1024 (1989); Bazan, J.F., Proceedings of the National
Academy of Sciences of the United States of America 87, 6934-8 (1990) and
Stahl et al., Cell 74, 587-590 (1993)). Functionally, the initial events in receptor-
mediated signaling are the association, autophosphorylation and activation of
one or two forms of the JAK family of tyrosine kinases (Chantler et al.,
Biophysical Journal 59, 1242-50 (1991); Finbloom et al., Cellular Signalling 7,
739-745 (1995); Ihle et al., Annual Review of Immunology 13, 369-398 (1995);
Witthuhn, W.A., et al., Cell 74, 227-236 (1993)). Binding of JAKs to the receptors
is mediated by common sequence elements (Box1 and Box2) of the intracellular
domains of these receptors (Murikami, et al., Proceedings of the National
Academy of Science USA 88, 11349-1153 (1991) and Tanner et al., Journal of
Biological Chemistry 270, 6523-6530 (1995)). Crystal structures of hGH bound
to GH receptor and EpoR bound to an agonist peptide EMP1 have shown that
both the tertiary structures and oligomeric states of these two receptors are
identical (Livnah, et al., Science 273, 464-71 (1996) and De Vos, Science 255,
306-312 (1992)). Both receptor-ligand complexes were found to be C2
symmetric homo-dimers that bound through two different surfaces of the
receptors to one molecule of GH in the case of the GH receptor or a dimer of the
EpoR agonist peptide. These studies, structures of other growth hormone
receptors and biochemical analysis have led to the generally accepted
dimerization model of growth factor-mediated receptor activation. Monomeric
membrane-bound receptors remain inactive until ligand binds to and
oligomerizes the receptors. The activation event is autophosphorylation of
intrinsic intracellular or non-covalently associated kinases brought into contact
by the dimerization of receptors. Dimer- or oligomerization of receptors is a
necessary but clearly not a sufficient condition for receptor activation. Other
model receptors such as insulin and bacterial chemotactic Tar receptors exist as
dimers in absence of ligand, and Tar receptors have been demonstrated to
undergo ligand-induced change in conformation mechanically coupled to
activation of the cytosolic kinase domain. Until now there was no direct
biochemical or structural evidence that ligand-mediated activation of cytokine
receptors could also involve an allosteric mechanism. Wilson et al. (1998) have
solved the structure of unligated EpoR and shown that it is also a dimer, but with
a dramatically different arrangement of the two subunits. Among the features of
the unligated extracellular domain is that the C-terminals of the monomers, the
points of insertion into the membrane, are separated by 82 Å, compared to 34
Å in the ligated form. Assuming that these structures reflect the conformation of
a constitutive dimer in cells, it could be proposed that receptor activation
would consist of a ligand-induced reorganization of the dimer that brings
the intracellular domains into closer proximity and allows the associated JAK2s
to come into contact and autophosphorylate (Fig. 8).

Applicants have also developed a fluorescent assay based
on dimerization-induced complementation of designed fragments of the enzyme
murine dihydrofolate reductase (DHFR) (Pelletier et al., Protein Engineering 10,
89 (1997)) (Figure 8). The basis for the assay is that complementary fragments
of DHFR, when expressed and reassembled in cells, will bind to the high affinity
(Kd = 100 pM) fluorescein-conjugated inhibitor methotrexate (fMTX) in a 1:1
complex. fMTX is retained in cells by this complex, while unbound fMTX is
actively and rapidly transported out of the cells (Kaufman et al., Journal of
Biological Chemistry 253, 5852-60 (1978) and Israel et al., Proceedings of the
National Academy of Sciences of the United States of America 90, 4290-4
(1993)). In addition, binding of fMTX to DHFR results in a 4.5-fold increase in
quantum yield. Bound fMTX, and by inference reconstituted DHFR, can then be
monitored by fluorescence microscopy, FACS or spectroscopy. Since the complex
of fMTX with DHFR is 1:1, measured fluorescence can be calibrated to determine
average numbers of complexes in individual cells or averages in a population of
cells. To test the allosteric model of receptor activation, it was reasoned that if
the receptor transmembrane domains are separated by the distance observed in
the crystal structure of unligated EpoR, then DHFR fragments fused to the C-
terminal of the transmembrane domains will complement only if ligand induces
the necessary conformation change that allows the fragments to come into
contact. Furthermore, the absolute regio- and stereospecific requirement that
fragments be sufficiently close to fold and reassemble into the enzyme's three-
dimensional structure means that a false response, which might occur if the
fused interacting proteins were merely proximal, is unlikely. In addition,
insertion of flexible linker peptides of a critical length between the
transmembrane domain and the fragments should result in constitutive
complementation, insensitive to ligand. Based on the EpoR crystal structure, the
minimum length of linker necessary for a constitutive response would be 10
amino acids, assuming the length of an average peptide bond is ~4 Å and the
distance separating the fragments is 82 Å. Longer linkers should result in
complementation, independent of ligand. Linkers of 5, 10 and 30 amino acids
corresponding to extended lengths of 20, 40 and 120 Å, respectively, were thus
used (Fig. 8).
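
The linker arithmetic above can be checked with a short calculation. This is a
simplified sketch under stated assumptions: linkers are treated as fully
extended at about 4 Å per residue, the two linkers of a receptor dimer are
assumed to add their reach, and the size of the DHFR fragments themselves is
ignored. The function names are hypothetical.

```python
ANGSTROM_PER_RESIDUE = 4.0   # approximate length per residue, from the text
UNLIGATED_SEPARATION = 82.0  # C-terminal separation without ligand, in angstroms
LIGATED_SEPARATION = 34.0    # C-terminal separation with ligand, in angstroms


def extended_length(n_residues):
    """Extended length of one linker, in angstroms."""
    return n_residues * ANGSTROM_PER_RESIDUE


def combined_reach(n_residues):
    """Reach of two identical linkers, one on each receptor chain."""
    return 2 * extended_length(n_residues)


def can_bridge(n_residues, separation):
    """True if the two fully extended linkers can span the separation."""
    return combined_reach(n_residues) >= separation


for n in (5, 10, 30):
    print(n, extended_length(n),
          can_bridge(n, UNLIGATED_SEPARATION),
          can_bridge(n, LIGATED_SEPARATION))
```

Under these assumptions a pair of 10-residue linkers reaches 80 Å, just short
of the 82 Å unligated separation, so the 10-residue case is borderline, while
the 30-residue linkers span either conformation.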

The present invention is illustrated in further detail by the
following non-limiting examples.

EXAMPLE 1
EXPERIMENTAL PROTOCOL

DNA Constructs 

Mutagenic and sequencing oligonucleotides were purchased
from Gibco BRL. Restriction endonucleases and DNA modifying enzymes were
from Pharmacia and New England Biolabs. The mDHFR fragments carrying
their own in-frame stop codon were subcloned into pQE-32 (Qiagen),
downstream from and in-frame with the hexahistidine peptide and a GCN4
leucine zipper (Fig. 1; Fig. 2). All final constructs were based on the Qiagen pQE
series of vectors, which contain an inducible promoter-operator element (tac), a
consensus ribosomal binding site, initiator codon and nucleotides coding for a
hexahistidine peptide. Full-length mDHFR is expressed from pQE-16 (Qiagen).
Expression vector harboring the GCN4 leucine zipper

Residues 235 to 281 of the GCN4 leucine zipper (a
SalI/BamHI 254 bp fragment) were obtained from a yeast expression plasmid
pRS316 9. The recessed terminus at the BamHI site was filled-in with Klenow
polymerase and the fragment was ligated to pQE-32 linearized with
SalI/HindIII(filled-in). The product, construct Z, carries an open reading frame
coding for the sequence Met-Arg-Gly-Ser followed by a hexahistidine tag and 13
residues preceding the GCN4 leucine zipper residues.
Creation of DHFR fragments

The eukaryotic transient expression vector, pMT3 (derived
from pMT2) 16, was used as a template for PCR-generation of mDHFR containing
the features allowing subcloning and separate expression of fragment[1,2] and
fragment[3]. The megaprimer method of PCR mutagenesis® was used to
generate a full-length 590 bp product. Oligonucleotides complementary to the
nucleotide sequence coding for the N- and C-termini of mDHFR and containing
a novel BspEI site outside the coding sequence were used as well as an
oligonucleotide used to create a novel stop codon after fragment[1,2], followed
by a novel SpeI site for use in subcloning fragment[3].

Construction of a new multiple cloning region and subcloning of DHFR
fragments [1,2] and [3]

Complementary oligonucleotides containing the novel
restriction sites SnaBI, NheI, SpeI and BspEI were hybridized together, resulting
in 5' and 3' overhangs complementary to EcoRI, and inserted into pMT3 at a
unique EcoRI site. The 590 bp PCR product (described above) was digested
with BspEI and inserted into pMT3 linearized at BspEI, yielding construct[1,2,3].
The 610 bp BspEI/EcoNI fragment (coding for DHFR fragments[1,2], followed by
a novel stop and fragment[3] up to EcoNI) was filled in at EcoNI and subcloned
into pMT3 opened with BspEI/HpaI, yielding construct F[1,2]. The 250 bp
SpeI/BspEI fragment of construct[1,2,3] coding for DHFR fragment[3] (with no
in-frame stop codon) was subcloned into pMT3 opened with the same enzymes.
The stop codon of the wild-type DHFR sequence, downstream from fragment[3]
in pMT3, was inserted as follows. Cleavage with EcoNI, present in both the
inserted fragment[3] and the wild-type fragment[3], removal of the 683 bp
intervening sequence and religation of the vector yielded a construct of
fragment[3] with the wild-type stop codon, construct F[3].
Creation of the expression constructs

The 1051 bp and the 958 bp SnaBI/XbaI fragments of
constructs F[1,2] and F[3], respectively, were subcloned into construct Z opened
with BglII(filled-in)/NheI, yielding constructs Z-F[1,2] and Z-F[3] (Fig. 2). For the
Control expression construct, the 180 bp XmaI/BspEI fragment coding for the
zipper was removed from construct Z-F[1,2], yielding construct Control-F[1,2]
(Fig. 2).

Creation of Stability Mutants 

Site-directed mutagenesis was performed 50 to produce
mutants at Ile114 (numbering of the wild-type mDHFR). The mutagenesis
reaction was carried out on the KpnI/BamHI fragment of construct Z-F[3]
subcloned into pBluescript SK+ (Stratagene), using oligonucleotides that encode
a silent mutation producing a novel BamHI site. The 206 bp NheI/EcoNI
fragment of putative mutants identified by restriction was subcloned back into Z-
F[3]. The mutations were confirmed by DNA sequencing.

E. coli Survival Assay

E. coli strain BL21 carrying plasmid pRep4 (from Qiagen, for
constitutive expression of the lac repressor) were made competent, transformed
with the appropriate DNA constructs and washed twice with minimal medium
before plating on minimal medium plates containing 50 mg/ml kanamycin, 100
mg/ml ampicillin and 0.5 mg/ml trimethoprim. One half of each transformation
mixture was plated in the absence, and the second half in the presence, of 1 mM
IPTG. All plates were placed at 37°C for 66 hrs.
E. coli Growth Curves

Colonies obtained from cotransformation were propagated
and used to inoculate 10 ml of minimal medium supplemented with ampicillin,
kanamycin as well as IPTG (1 mM) and trimethoprim (1 µg/ml) where indicated.
Cotransformants of Z-F[1,2] + Z-F[3:Ile114Gly] were obtained under non-
selective conditions by plating the transformation mixture on L-agar (+
kanamycin and ampicillin) and screening for the presence of the two constructs
by restriction analysis. All growth curves were performed in triplicate. Aliquots
were withdrawn periodically for measurement of optical density. Doubling time
was calculated for early logarithmic growth (OD600 between 0.02 and 0.2).
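
One common way to obtain a doubling time from such readings is a linear fit of
ln(OD600) against time over the early logarithmic window; the text does not
state which calculation was used, so the following is a sketch under that
assumption. The function name and the readings are hypothetical (the readings
are synthetic, generated for a 60 minute doubling time).

```python
import math


def doubling_time(times, ods, od_min=0.02, od_max=0.2):
    """Doubling time from a least-squares fit of ln(OD) versus time.

    Only readings with od_min <= OD <= od_max (early log phase) are used.
    The result is in the same units as `times`.
    """
    pts = [(t, math.log(od)) for t, od in zip(times, ods)
           if od_min <= od <= od_max]
    if len(pts) < 2:
        raise ValueError("need at least two readings in the OD window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    return math.log(2) / slope


# Synthetic readings (minutes, OD600) for a culture doubling every 60 minutes.
times = [0, 60, 120, 180, 240, 300]
ods = [0.01 * 2 ** (t / 60) for t in times]
print(round(doubling_time(times, ods), 1))
```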
Protein Overexpression and Purification

Bacteria were propagated in Terrific Broth? 1 in the presence
of the appropriate antibiotics to an OD600 of approximately 1.0. Expression was
induced by addition of 1 mM IPTG and further incubation for 3 hrs. For analysis
of crude extract, pellets from 150 ml of induced cells were lysed by boiling in
loading dye. The lysates were clarified by microcentrifugation and analyzed by
SDS-PAGE 32. For protein purification, a cell pellet from 50 ml of induced E. coli
cotransformed with constructs Z-F[1,2] and Z-F[3] was lysed by sonication, and
a denaturing purification of the insoluble pellet undertaken using Ni-NTA
(Qiagen) as described by the manufacturer. The proteins were eluted with a
stepwise imidazole gradient. The fractions were analyzed by SDS-PAGE.
RESULTS 

Design of mDHFR fragments for a PCA 

mDHFR shares high sequence identity with the human DHFR 
(hDHFR) sequence. As the coordinates of the murine crystal structure were not 
available, the design considerations were based on the hDHFR structure. DHFR 
has been described as comprising three structural fragments forming two 
domains: the adenine binding domain (F[2]) and a discontinuous domain (F[1] 
and F[3]) 13 18 . The folate binding pocket and the NADPH binding groove are 
formed mainly by residues belonging to F[1] and F[2]. Residues 101 to 108 of 
hDHFR form a disordered loop which lies on the same face of the protein as both 
termini. This loop occurs at the junction between F[2] and F[3]. By cleaving 
mDHFR at residue 107, F[1,2] and F[3] were created, thus causing minimal 
disruption of the active site and substrate binding sites. The native N-terminus 
of mDHFR and the novel N-terminus created by cleavage were covalently 
attached to the C-termini of GCN4 leucine zippers (Fig. 1). 
E. coli Survival Assays 

Figure 2 illustrates the general features of the expressed 
constructs and the nomenclature used in this study. Figure 3 (panel A) illustrates 
the results of cotransformation of bacteria with constructs coding for Z-F[1,2] and 
Z-F[3], in the presence of trimethoprim, clearly showing that colony growth under 
selective pressure is possible only in cells expressing both fragments of mDHFR. 
There is no growth in the presence of either Z-F[1,2] or Z-F[3] alone. Induction 
of protein expression with IPTG is essential for colony growth (Fig. 3A). The 
presence of the leucine zipper on both fragments of mDHFR is essential, as 
illustrated by cotransformation of bacteria with both vectors coding for mDHFR 
fragments, only one of which carries a leucine zipper (Fig. 3A). It should be 
noted that growth of control E. coli transformed with the full-length mDHFR is 
possible in the absence of IPTG due to low levels of expression in uninduced 
cells. 

Confirmation of the presence of both plasmids in bacteria 
now able to grow with trimethoprim was obtained from restriction analysis of the 
plasmid DNA purified from isolated colonies. Figure 4 (A) reveals the presence 
of the 1200 bp HincII restriction fragment from construct Z-F[1,2] as well as the 
487 and 599 bp HincII restriction fragments from construct Z-F[3]. Also present 
is the 935 bp HincII fragment of pRep4. Overexpression of the fusion proteins 
is illustrated in Figure 4 (B). In all cases, overexpression of a protein of the 
expected molecular weight is apparent on SDS-PAGE of the crude lysate. 
Purification of the coexpressed proteins under denaturing conditions yielded two 
bands of apparent homogeneity upon analysis by Coomassie-stained SDS- 
PAGE (Fig. 4B). 
Stability Mutants 

Applicants generated mutants of F[3] to test whether 
reconstitution of mDHFR activity by fragment assembly was specific. Protein 
stability can be reduced by changing the side-chain volume in the hydrophobic 
core of a protein 9 , 22 * 25 . Residue Ile114 of mDHFR occurs in a core β-strand at 
the interface between F[1,2] and F[3], isolated from the active site. Ile114 is in 
van der Waals contact with Ile51 and Leu93 in F[1,2]. Ile114 was mutated to 
Val, Ala, or Gly. Figure 3 (panel B) illustrates the results of cotransformation of 
E. coli with construct Z-F[1,2] and the mutated Z-F[3] constructs. The colonies 
obtained from cotransformation with Z-F[3:Ile114Ala] grew more slowly than 
those cotransformed with Z-F[3] or Z-F[3:Ile114Val] (see inset to Fig. 3B). No 
colony growth was detected in cells cotransformed with Z-F[3:Ile114Gly]. The 
number of transformants obtained was not significantly different in the cases 
where colonies were observed, implying that cells cotransformed with Z-F[1,2] 
and either Z-F[3], Z-F[3:Ile114Val] or Z-F[3:Ile114Ala] have an equal survival 
rate. Overexpression of the mutants Z-F[3:Ile114X] was in the same range as 
Z-F[3], as determined by Coomassie-stained SDS-PAGE (data not shown). 

The relative efficiency of reassembly of mDHFR fragments 
was also compared by measuring the doubling time of the cotransformants in 
liquid medium. Doubling time in minimal medium was constant for all 
transformants (data not shown). Selective pressure by trimethoprim in the 
absence of IPTG prevented growth of E. coli except when transformed with 
pQE-16 coding for full-length DHFR, due to low levels of expression in uninduced 
cells. Induction of mDHFR fragment expression with IPTG allowed survival of 
cotransformed cells (except in the case of Z-F[1,2] + Z-F[3:Ile114Gly]), although 
the doubling times were significantly increased relative to growth in the absence 
of trimethoprim. The doubling times measured for cells expressing Z-F[1,2] + 
Z-F[3], Z-F[1,2] + Z-F[3:Ile114Val] and Z-F[1,2] + Z-F[3:Ile114Ala] were 1.6-fold, 
1.9-fold and 4.1-fold higher, respectively, than the doubling time of E. coli 
expressing pQE-16 in the absence of trimethoprim and IPTG. The presence of 
IPTG unexpectedly prevented growth of E. coli transformed with full-length 
mDHFR. Growth was partially restored by addition of the folate metabolism end- 
products thymine, adenine, pantothenate, glycine and methionine (data not 
shown). This suggests that induced overexpression of mDHFR was lethal to E. 
coli when grown in minimal medium as a result of depletion of the folate pool by 
binding to the enzyme. 

In another embodiment, applicants make point mutations in 
the GCN4 leucine zipper of Z-F[1,2] and Z-F[3], for which direct equilibrium and 
kinetic parameters are known, and correlate these known values with 
parameters derived from the PCA (Pelletier and Michnick, in preparation). 
Comparison of cell growth rates in this model system with rates for a DHFR PCA 
using unknowns would give an estimate of the strength of the unknown 
interaction. This should enable the determination of estimates of equilibrium and 
kinetic parameters for a specific protein-protein interaction. 

The present invention has illustrated and demonstrated a 
protein-fragment complementation assay (PCA) based on mDHFR, where a 
leucine zipper directs the reconstitution of DHFR activity. Activity was detected 
by an E. coli survival assay which is both practical and inexpensive. This system 
illustrates the use of mDHFR fragment complementation in the detection of 
leucine zipper dimerization and could be applied to the detection of unknown, 
specific protein-protein interactions in vivo. 

E. coli Aminoglycoside kinase: Optimization and Design of a PCA using an 
Exonuclease-Molecular Evolution Strategy 

Although applicants have demonstrated that the 
engineering/design strategy described above can be used to produce 
complementary enzyme fragments, it is obvious that proteins did not evolve in 
such a way that such fragments would be expected to have optimal physical 
characteristics, including solubility, foldability (fast folding), protease resistance, 
or enzymatic activity. An alternative embodiment to the engineering/design 
strategy is the exonuclease/evolution approach. This strategy can be used by 
itself or in conjunction with the engineering/design strategy. The advantages of 
this approach are that, in principle, prior knowledge of the protein structure is not 
necessary, that the optimal fragments are chosen for PCA and that these 
fragments will also have optimal characteristics. Following selection of optimal 
complementary fragments, the fragments are exposed to multiple rounds of 
random mutagenesis. Mutagenesis is achieved by suboptimal PCR combined 
with chemical mutagenesis or DNA shuffling (Stemmer, W. P. C. 1994, Proc. 
Natl. Acad. Sci. USA 91: 10747-10751). The overall strategy is described for the 
case of aminoglycoside kinase (AK), an example of an antibiotic resistance marker 
that can be used for dominant selection of prokaryotic cells such as E. coli or 
eukaryotic cells such as yeast or mammalian cell lines. The structure of AK is 
already known, and so strategy (1) would be possible; however, a combination 
of strategy (1) as defined for DHFR above, in conjunction with strategy (2), was 
chosen. 

EXPERIMENTAL PROTOCOL 

The optimization/selection procedure is as follows. 

Generation of a library of AK fragments based on Products of 
Exonuclease digestion 

Nested sets of deletions are created at the 5' and the 3' ends 
of the AK gene. In order to create unidirectional deletions, unique restriction 
sites are introduced in the regions flanking the AK gene. At the 5' and 3' termini 
an "outer" sticky site with a protruding 3' terminus (Sph I and Kpn I, respectively) 
and an "inner" sticky site with recessed 3' terminus (Bgl II and Sal I, respectively) 
are added by PCR. Cleavage at Sph I and Bgl II (or Kpn I and Sal I) results in 
creation of a protruding terminus leading back to the flanking sequence and a 
recessed terminus leading into the AK gene. Digestion with E. coli exonuclease 
III and S1 nuclease (Henikoff, S. 1987, Methods in Enzymology 155: 156-165) 
yields a set of nested deletions from the recessed terminus only. Thus, 10 mg of 
DNA is digested with Sph I and Bgl II (or Kpn I and Sal I), phenol-chloroform 
extracted, and 12.5 U exonuclease III added. At 30 sec intervals over 10 min, 
aliquots are taken and put into solution with 2 U S1 nuclease. The newly created 
ends are filled in with T4 DNA polymerase (0.1 U per sample) and the set of 
vectors closed back by blunt-ended ligation (10 U ligase per sample). The 
average length of the deletion at each time point is determined by restriction 
analysis of the sets. This yields sets of AK genes deleted from the 5' or the 3' 
termini. This manipulation is undertaken directly in the pQE-32-Zipper 
constructs, such that the products can be used directly in activity screening. 
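
As a rough planning aid, the sampling schedule and per-sample enzyme amounts quoted above can be tallied as below. This is only a sketch under stated assumptions: the first aliquot is taken at 30 sec and the last at 10 min, and the 2 U of S1 nuclease is read as a per-aliquot amount. The variable names are hypothetical.

```python
# Sampling schedule for the exonuclease III time course (values from the text).
INTERVAL_S = 30          # one aliquot every 30 sec
DURATION_S = 10 * 60     # over 10 min

# Assumption: aliquots at 30 s, 60 s, ..., 600 s (no zero-time sample).
timepoints = list(range(INTERVAL_S, DURATION_S + 1, INTERVAL_S))

# Per-sample enzyme amounts quoted in the protocol, in units (U).
per_sample_units = {
    "S1 nuclease": 2.0,          # assumed to be per aliquot
    "T4 DNA polymerase": 0.1,
    "ligase": 10.0,
}

# Total enzyme needed for one complete deletion series.
totals = {name: units * len(timepoints) for name, units in per_sample_units.items()}

print(len(timepoints), "aliquots")
for name, units in totals.items():
    print(f"{name}: {units:g} U in total")
```

Each aliquot corresponds to one deletion set, so a full series under these assumptions gives 20 sets per terminus.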
Screening for AK activity 

As a first step in determining the requirements for fragment 
complementation, the minimum N-terminal and C-terminal fragments of AK that, 
alone, are active must be determined. Sets of deletions are individually 
transformed into E. coli BL21 cells and expression of the AK fragments is 
induced by IPTG. The sets where a significant number of colonies appear in the 
presence of G418 serve to indicate the approximate length of N- and C-terminal 
AK fragments which retain activity. Fragment complementation must therefore 
be undertaken with fragments taken from within these limits. The zipper-directed 
fragment complementation is detected as follows: appropriate sets of deletions, 
or pools of sets, are cotransformed into BL21, expression is induced with IPTG 
and growth in the presence of varying G418 concentrations is monitored. Large 
colonies which grow in the presence of high G418 concentrations are selected 
as giving the most efficiently complementing products. 

Directed evolution of optimal AK fragments using "DNA shuffling" 

After optimal fragments have been selected, the individual 
fragments are removed by restriction digestion at Sph I and Kpn I, allowing for 5' 
and 3' constant priming regions flanking the N- or C-terminal complementary 
fragments of AK. These oligonucleotides (2-4 µg) are digested with DNaseI 
(0.005 units/µl, 100 µl) and fragments of 10-50 nucleotides are extracted from 
low melting point agarose. PCR is then performed with the fragmented DNA, 
using Taq polymerase (2.5 units/µl) in a PCR mixture containing 0.2 mM 
dNTPs, 2.2 mM MgCl2 (or 0 mM for suboptimal PCR), 50 mM KCl, 10 mM 
Tris-HCl, pH 9.0, and 0.1% Triton X-100. The PCR program is 94°C for 60 sec; 
30 to 50 cycles of 94°C for 30 sec, 55°C for 30 sec and 72°C for 30 sec; and 
finally 72°C for 5 min. Samples are taken 
every 5 cycles after 25 cycles to monitor the appearance of reassembled 
complete fragments on agarose gel. The primerless PCR product is then diluted 
1:40 or 1:60 and used as template for PCR with 5' and 3' complementary constant 
region oligos as primers for a further 20 cycles. Final product is restriction 
digested with Sph I and Kpn I and the products subcloned back into pQE-32- 
Zipper to yield the final library of expression plasmids. As before, E. coli BL21 
cells are sequentially transformed with C-terminal or N-terminal complementary 
fragment-expression vectors at an estimated efficiency of 1CP and finally cells 
are cotransformed with the complementary fragment. E. coli are grown on agarose 
plates containing 1 µg/ml G418 and after 16 hours the largest colonies are 
selected and grown in liquid medium at increasing concentrations of G418. 
Those clones showing the maximal resistance to G418 are then selected and if 
maximum resistance or greater is reached the evolution is terminated. 
Otherwise the DNA shuffling procedure is repeated. Finally, optimal fragments 
are sequenced and physical properties and enzymatic activity are assessed. 
This optimized AK PCA is now ready to test for dominant selection in any other 
cell type including yeast and mammalian cell lines. This strategy can be used 
to develop any PCA based on enzymes that impart dominant or recessive 
selection to a drug or toxin or to enzymes that produce a colored or fluorescent 
product. In the latter two cases the end point of the evolution process is, at 
minimum, retention of the signal of the intact, wild type enzyme, or enhancement 
of the signal. This strategy can also be used in the absence of knowledge of the 
enzyme structure, whether the enzyme has a mono-, di- or multimeric structure. 
However, knowledge of the enzyme structure does not preclude applying this 
strategy as well, as described below. 

As can be appreciated, knowledge of the enzyme structure 
can provide a more efficient way of using molecular evolution to design a PCA. 
In this case, the enzyme structure is used to define minimal domains of the 
protein in question, as was done for DHFR. Instead of generating fragments of 
completely random length for the N- or C-terminal fragments, fragments that at 
a minimum will code for one of the two domains are selected during the 
exonuclease phase. For instance, in the case of AK, two well defined domains 
can be discerned in the structure, consisting of residues 1-94 in the N-terminus 
and residues 95-267 in the C-terminus. Exonuclease digestions are performed 
as above, but reaction products are selected that will minimally code for one of 
the two domains. These are then the starting points for fragment selection and 
evolution cycles as described above. 


Heteromeric Enzyme PCA 

A further embodiment of the invention relates to PCA based 
on using heterodimeric or heteromultimeric enzymes in which the entire catalytic 
machinery is contained within one independently folding subunit and the other 
subunit provides stability and/or a cofactor to the enzymatic subunit. In this 
embodiment of PCA, the regulatory subunit is split into complementary fragments 
and fused to interacting proteins. These fragments are 
cotransformed/transfected into cells along with the enzyme subunit. As with single 
enzyme PCA, described for DHFR and AK, reconstitution and detection of 
enzyme activity is dependent on oligomerization domain-assisted reassembly of 
the regulatory subunit into its native topology. However, the reconstituted 
subunit then interacts with the intact enzymatic subunit to produce activity. This 
approach is reminiscent of the USPS system, except that it has the advantage 
that, in this case, the enzyme is not a constitutive cellular enzyme, but an 
exogenous gene product. As such there is no problem with background activity 
from the host cell, the enzyme can be expressed at higher levels than a natural 
gene and can also be modified to be directed to specific subcellular 
compartments (by subcloning compartment-specific signal peptides onto the N- 
or C-termini of the enzyme and subunit fragments). The specific advantage of 
this approach is that while the single enzyme strategy may lead to suboptimal 
enzymatic activity, in this approach, the enzyme folds independently and may in 
fact act as a chaperone to the fragmented regulatory subunit, aiding in its 
refolding. In addition, folding of the fragments may not need to be complete in 
order to impart regulation of the enzyme. This approach is realized by a 
colorimetric/fluorometric assay that was developed based on the 
Streptomyces tyrosinase. This enzyme catalyzes the conversion of tyrosine to 
dihydroxyphenylalanine (DOPA). The reaction can be measured by conversion of 
fluorocinyl-tyrosine to the DOPA form. The active enzyme consists of two 
subunits, the catalytic domain (Melc2) and a copper binding domain (Melc1). 
Melc1 is a small protein of 14 kD that is absolutely required for Melc2 activity. 
In one assay which was developed, the Melc1 protein is split into two fragments 
that serve as the complementation part of the PCA. These fragments, fused to 
oligomerization domains, are coexpressed with Melc2, and the basis of the assay 
is that Melc2 activity is dependent on complementation of the Melc1 fragments. 
Stoichiometries of protein complexes can also be addressed (i.e. whether a 
complex consists of two or three proteins) as follows. One fuses two proteins to 
the two Melc1 fragments and a third to intact Melc2. It thus can be shown that the 
minimum complementary active complex of the tyrosinase requires all 
three components, and therefore that a trimer is necessary. A key aspect of this 
approach is that specific interactions can easily be demonstrated by making one 
component, specifically the catalytic subunit (the protein-Melc2 fusion), dependent 
on the other components, by underexpressing it in the background of 
overexpressed Melc1 fragment-protein fusions. 
Multimer Disruption-Based PCA 

Although applicants have described only fragment 
complementation of intact proteins, protein domains or subunits as comprising 
PCA, an alternate embodiment relates to PCAs based on the disruption of the 
interface between the subunits of, for instance, a dimeric enzyme that requires 
stable association of the subunits for catalytic activity. In such cases, selective 
or random mutagenesis at the subunit interface would disrupt the interaction and 
the basis of the assay would be that oligomerization domains fused to the 
subunits would provide the necessary binding energy to bring the subunits 
together into a functional enzyme. 

Vector Design in Application to PCAs 

The PCA strategies listed thus far have used two-plasmid 
transformation strategies for expression of complementary fragments. This 
approach has some advantages, such as using different drug resistance markers 
to select for optimal incorporation of genes, for instance, in transformed or 
transfected cells or for optimum transformation of complementary plasmids into 
bacteria and control of expression levels of PCA fragments using different 
promoters. However, single plasmid strategies have advantages in terms of 
simplicity of transfection/transformation. Protein expression levels can be 
controlled in different ways, while drug selection can be achieved in one of two 
ways: In the case of PCAs based on survival assay using enzymes that are drug 
resistance markers themselves, such as AK, or where the enzyme complements 
a metabolic pathway, such as DHFR, no additional drug resistance gene need 
be incorporated in the expression plasmids. If however the PCA is based on an 
enzyme that produces a colored or fluorescent product, such as tyrosinase or 
firefly luciferase, an additional drug resistance gene must be expressed from the 
plasmid. Expression of PCA complementary fragments and fused cDNA 
libraries/target genes can be assembled on single plasmids as individual 
operons under the control of separate inducible or constitutive promoters, or can 
be expressed polycistronically. In E. coli, polycistronic expression can be 
achieved using known intercoding region sequences. For instance, the region 
in the mel operon from which the tyrosinase melc1-melc2 genes were derived 
can be used. Indeed, this region was shown to be expressed at high levels in E. 
coli under the control of a strong (tac) promoter. Genes could also be expressed 
and induced by independent promoters, such as tac and arabinose. For 
mammalian expression systems, single plasmid systems can be used for both 
transient or stable cell line expression and for constitutive or inducible 
expression. Further, the expression of one of the 
complementary fragment fusions, usually the bait-fused fragment, can be 
differentially controlled to minimize expression. This will be important in reducing 
background non-specific interactions. Examples of differential control of 
complementary fragment expression include the following strategies: 

i) In polycistronic expression, transient or stable, expression of the second gene 
will necessarily be less efficient. This in itself could thus serve to limit the 
quantity of one of the complementary fragments. Alternatively, the expression 
of the first gene product can be limited by mutation of an upstream donor/splice 
site, while the second gene can be put under the control of a retroviral internal 
initiation site, such as that of EMCV or poliovirus, to enhance expression. 

ii) Individual complementary fragment-fusion pairs can also be put under the 
control of commercially available inducible promoters, including for example 
those based on Tet-responsive PhCMV*-1 promoter, and/or steroid receptor 
response elements. In such a system the two complementary fragment genes 
can be turned on and expression levels controlled by dose dependent 
expression with the inducer (tetracycline and steroid hormones, respectively). 


EXAMPLE 2 

Applications of the PCA strategy to detect novel gene products in 
biochemical Pathways and to map such pathways 

Among the greatest advantages of PCAs over other molecular 
interaction screening methods is that they are designed to be versatile enough 
to be performed both in vivo and in any type of cell. This feature is 
crucial if the goal of applying a technique is to identify novel interactions from 
libraries and simultaneously be able to determine if the interactions observed are 
biologically relevant. The detailed example given below, and other examples at 
the end of this section, illustrate how validation of interactions with PCA 
is possible. In essence, this is achieved as follows. In biochemical pathways, 
such as hormone receptor-mediated signaling, a cascade of enzyme-mediated 
chemical reactions is triggered by some molecular event (e.g. hormone binding 
to its membrane surface receptor). Enzyme interactions with protein substrates 
and protein-protein or protein-nucleic acid interactions with enzyme-modified 
substrates then occur. Such biochemical signaling cascades generally only 
occur in specific cell types and model cell lines for studying these processes. 
Therefore, to detect induced interactions, such as those of known proteins in a 
pathway with yet unidentified proteins, one obviously needs to perform such 
screening in appropriate model cell lines and in the correct cellular compartment. 
Only the PCA strategy can be used in a general way to do this. Protein- 
molecular interaction techniques such as yeast two- or three-hybrid techniques 
cannot be performed in a context where such events occur, except in the limiting 
case of nuclear interaction in yeast or interactions that are not triggered. There 
do exist mammalian two-hybrid techniques making it possible to detect induced 
protein interactions, but only if the proteins involved can be simultaneously 
activated, transported to the nucleus and interact with their partners. PCAs do 
not have these limitations since they do not require additional cellular machinery 
available only in specific compartments. A further point is that by performing the 
PCA strategy in appropriate model cell types, it is also possible to introduce 
appropriate positive and negative controls for studying a particular pathway. For 
instance, for a hormone signaling pathway, it is likely that hormone signaling 
agonists and antagonists or dominant-negative mutants of signaling cascade 
proteins that are upstream or act in parallel to the events being examined in the 
PCA would be known. These reagents could be used to determine if novel 
interactions detected by the PCA are biologically relevant. In general then, 
interactions that are detected only if a hormone is introduced, but are not seen 
if an antagonist is simultaneously introduced, could be hypothesized to represent 
interactions relevant to the process under study. 
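
The control logic in the last sentence above can be written out as a small decision rule. The sketch below is illustrative only: the `PcaReadout` record, its field names and the category labels are hypothetical and do not come from the specification. It assumes each candidate pair has been scored as positive or negative in the PCA under three conditions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PcaReadout:
    """PCA signal (True = interaction detected) for one candidate protein pair."""
    pair: str
    no_treatment: bool
    hormone: bool
    hormone_plus_antagonist: bool

def classify(r: PcaReadout) -> str:
    """Sort a readout into the categories discussed in the text."""
    if not r.hormone:
        return "no interaction detected"
    if r.no_treatment:
        return "constitutive"
    if r.hormone_plus_antagonist:
        return "induced, not blocked by antagonist"
    # Seen only with hormone and lost when the antagonist is added.
    return "induced, candidate pathway-relevant"

# Hypothetical screening results.
readouts = [
    PcaReadout("bait + prey A", False, True, False),
    PcaReadout("bait + prey B", True, True, True),
    PcaReadout("bait + prey C", False, True, True),
]
for r in readouts:
    print(r.pair, "->", classify(r))
```

Only the first category corresponds to the interactions that the text proposes as relevant to the process under study; the others would need different controls to interpret.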

Below is a detailed description of an application of the DHFR 
PCA that illustrates these points, as well as further examples where the PCA 
strategy could be used. 

Application of the DHFR PCA to Mapping Growth Factor-Mediated Signal 
Transduction Pathways 

One of the earliest detectable events in growth factor- 
activated cell proliferation is the serine phosphorylation of the S6 protein of the 
40S ribosomal subunit. The discovery of serine/threonine kinases that 
specifically phosphorylate S6 has considerably aided in identifying novel 
mitogen mediated signal transduction pathways. The serine/threonine kinase 
p70S6k has been identified as a specific S6 phosphorylase 131 ' 136 . p70S6k is 
activated by serine and threonine phosphorylation at specific sites in response 
to several mitogenic signals including serum in serum starved cells, growth 
factors including insulin and PDGF, and mitogens such as phorbol esters. 
Considerable effort has been made over the last five years to determine how 
p70/p85S6k are activated in response to mitogens. Two receptor-mediated 
pathways have been implicated in p70S6k activation, one associated with the 
phosphatidylinositol-3-kinase (PI(3)k) and the other with the PI(3)k homologue 
mTOR 137 ' 144 . Key to understanding this proposal is the fact that the role of 
these enzymes in activation of p70S6k was determined by effects of two natural 
products on phosphorylation and enzyme activity: rapamycin, which indirectly 
inhibits mTOR activity, and wortmannin, which directly inhibits PI(3)k activity. It 
is also important to note that no direct upstream kinases or other regulatory 
proteins of p70S6k have been identified to this date. 

The interactions of p70S6k with its known substrate S6 can 
be studied as a test system for the DHFR PCA in E. coli and in mammalian cell 
lines. One can also seek to identify novel interactions with this enzyme that 
would lead to new insights into how this important enzyme is regulated. Also, 
since activation of the enzyme is mediated by multiple pathways that can be 
selectively inhibited with specific drugs, this is an ideal system to test PCAs as 
methods to distinguish induced versus constitutive protein-protein interactions. 
a) Testing of the E. coli survival assay: Interaction of p70S6k with S6 

This test is ideal, because the apparent Km (= 250 nM) of 
p70S6k for S6 protein 145 is approximately the same as the Kd for leucine zipper- 
forming peptides from GCN4 used in the test system herein disclosed. However, 
a constitutively active form of the enzyme was used for the following tests. An 
N-terminal truncated form of the enzyme, D77-p70S6k, is constitutively active and 
will be used in these studies 147 . 

Methodology: D77-p70S6k-F[1,2] fusion and D77-p70S6k-F[3] 
fusion, or F[1,2] and D77-p70S6k-F[3] fusion (as a control), will be cotransformed 
into E. coli and the cells grown in minimal medium in the presence of 
trimethoprim. Colonies will be selected and expanded for analysis of kinase 
activity against 40S ribosomal subunits, and for coexpression of the two proteins. 

b) Modification of the bacterial survival assay for library screening: 
Identification of Novel Interacting Proteins 

Screening an expression library for interactions with a given 
target (p70S6k-D77, in this case) will be straightforward in this system, given that 
the only steps involved are: 1-construction of the fusion-expression library as a 
fusion with mDHFR fragment F[3]; 2-transformation of the library in E. coli BL21 
harboring pRep4 (for constitutive expression of the lac repressor; this is required 
in the case of a protein product which is toxic to the cells) and a plasmid coding 
for the fusion: p70S6k-D77-F[1,2]; 3-plating on minimal medium in the presence 
of trimethoprim and IPTG; 4-selection of any colonies that grow, propagation and 
isolation of plasmid DNA, followed by sequencing of DNA inserts; 5-purification 
of unknown fusion products via the hexaHis-tag and sizing on SDS-PAGE. 
Methodology: 

The overall strategy is illustrated in Figure 5. 1-Construction of a
directional fusion-expression library: i-cDNA production: One can isolate
poly(A)+ RNA from BA/F3 cells (B-lymphoid cells) because these cells have
successfully been used in the study of the rapamycin-sensitive p70S6k activation
cascade 139. To enrich for full-length mRNA, the mRNA will be affinity-purified
via the 5' cap structure by the CAPture method* 48. Reverse transcription will
be primed by a "Linker Primer": it has a poly(T) tail to prime from the poly(A)
mRNA tail, and an XhoI site for later use in directional subcloning of the
fragments. The first strand is then methylated. After second strand synthesis
and blunting of the products, "EcoRI Adapters" are added. Digestion of the
linkers with EcoRI and XhoI (the inserts are protected by methylation) produces
full-length cDNA ready for directional insertion in a vector opened with EcoRI
and XhoI. Since the success of library screening depends largely on the quality
of the cDNA produced, the above methods will be used, as they have proven to
consistently produce high-quality cDNA libraries. ii-Insertion of the cDNA into
vectors: The library will be constructed as a C-terminal fusion to mDHFR F[3] in
vector pQE-32 (Qiagen), as high levels of expression of mDHFR fusions were
obtained from this vector in BL21 cells. Three such vectors will be created,
differing at their 3' end, which is the novel polycloning site that was
engineered (described earlier, under Methods), carrying either 0, 1, or 2
additional nucleotides. This allows read-through from F[3] into the library
fragments in all 3 translational reading frames. The cDNA fragments will be
directionally inserted at the EcoRI and XhoI sites in all three vectors at once.
2, 3, 4, and 5- These steps have been described earlier, under Results, apart
from the final sequencing of the clones identified, which uses sequencing
primers specific to the vector sequences flanking the sites of library
insertion. The protein purification will also be as described earlier, by a
one-step purification on Ni-NTA (Qiagen). If the product size is more than 15
kDa over the molecular weight of the DHFR component (equal to a cDNA insert of
more than 450 bp), the inserts will have to be sequenced (e.g. at the Sheldon
Biotechnology Center, McGill University).
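
The size threshold quoted above (15 kDa of extra protein corresponding to
roughly 450 bp of insert) can be checked with a short calculation. The sketch
below is only an illustration: it assumes an average residue mass of 110 Da,
a common rule of thumb that is not stated in the text, and the function names
are hypothetical. Under that assumption 15 kDa corresponds to about 410 bp,
which is of the same order as the 450 bp quoted.

```python
# Rough conversion between extra protein mass and cDNA insert length.
# ASSUMPTION: 110 Da per amino acid residue (rule of thumb, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def insert_bp_from_extra_mass(extra_kda: float) -> int:
    """Estimate the coding length (bp) that adds `extra_kda` to a fusion."""
    residues = extra_kda * 1000.0 / AVERAGE_RESIDUE_MASS_DA
    return round(residues * 3)  # three nucleotides per codon


def needs_sequencing(fusion_kda: float, dhfr_component_kda: float,
                     threshold_kda: float = 15.0) -> bool:
    """Apply the rule in the text: sequence the insert when the fusion is
    more than `threshold_kda` heavier than the DHFR component alone."""
    return (fusion_kda - dhfr_component_kda) > threshold_kda


if __name__ == "__main__":
    print(insert_bp_from_extra_mass(15.0))
    print(needs_sequencing(fusion_kda=30.0, dhfr_component_kda=12.0))
```

The masses passed to needs_sequencing in the usage lines are made-up values
for illustration, not measurements from the text.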

c) Development of the Eukaryotic Assay

The system described above can be adapted to produce an equivalent assay for
use in eukaryotic cells. The basic principle of the assay is the same: the
fragments of mDHFR are fused to associating domains, and domain association is
detected by reconstitution of DHFR activity in eukaryotic cells (Figure 5).

Creation of the expression constructs 

The DNA fragments coding for the GCN4-zipper-mDHFR fragment fusions were
inserted as one piece into pMT3, a eukaryotic transient expression vector 126.
Expression of the fusion proteins in COS cells was apparent on SDS-PAGE after
[35S]Met labeling.
Survival assays in eukaryotic cells

Two systems can be used for detection of mDHFR reassembly, in parallel:
i- CHO-DUKX B11 cells (Chinese Hamster Ovary cell line deficient in DHFR
activity) are cotransfected with GCN4-zipper-mDHFR fragment fusions. The cells
are grown in the absence of nucleotides; only cells carrying reconstituted DHFR
will undergo normal cell division and colony formation. ii- Methotrexate
(MTX)-resistant mutants of mDHFR have been created, with the goal of
transfecting cells that have constitutive DHFR activity, such as COS and 293
cells. F[1,2] was mutated in order to incorporate, one at a time, each of five
mutations that significantly increase the Ki (MTX): Gly15Trp, Leu22Phe,
Leu22Arg, Phe31Ser and Phe34Ser (numbering according to the wild-type mDHFR
sequence). These mutations occur at varying positions relative to the active
site and relative to F[3], and have varying effects on Km (DHF), Km (NADPH) and
Vmax of the full-length mammalian enzymes in which they were characterized.
Mutants Z-F[1,2: Leu22Phe], Z-F[1,2: Leu22Arg] and Z-F[1,2: Phe31Ser] all
allowed for bacterial survival with high growth rates when cotransformed with
Z-F[3] (results not shown). The five mutants will be tested in eukaryotic cells,
in reconstitution of mDHFR fragments to produce enzyme that can sustain COS or
293 cell growth while under the selective pressure of MTX, which will eliminate
background due to activity of the native enzyme. The mutations offer an
advantage in selection while presenting no apparent disadvantage with respect
to reassembly of active enzyme. If the reconstituted mDHFR produced in either
of the survival assays allows eukaryotic cell growth that is significantly
slower than growth with the wild-type enzyme, thymidylate will be added to the
growth medium to partially relieve the selective pressure offered by the lack
of nucleotides.
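
The point-mutation labels used above (Gly15Trp, Leu22Phe, Leu22Arg, Phe31Ser,
Phe34Ser) follow the usual "wild-type residue, position, replacement" form.
As a minimal sketch of how such labels can be handled when tracking mutant
constructs, the following parses a label and applies it to a sequence after
checking the wild-type residue. The helper names and the toy sequence are
hypothetical; the toy sequence is NOT the mDHFR sequence, which is not
reproduced here.

```python
import re

# Three-letter to one-letter amino acid codes (standard 20 residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

MUTATION_RE = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'Leu22Phe' into ('L', 22, 'F')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return THREE_TO_ONE[wild_type], int(position), THREE_TO_ONE[replacement]


def apply_mutation(sequence: str, label: str) -> str:
    """Return `sequence` with the mutation applied (positions are 1-based)."""
    wild_type, position, replacement = parse_mutation(label)
    if sequence[position - 1] != wild_type:
        raise ValueError(f"{label}: position {position} is "
                         f"{sequence[position - 1]}, not {wild_type}")
    return sequence[:position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    # Hypothetical toy sequence with Leu at position 22 and Phe at position 31.
    toy = "A" * 21 + "L" + "A" * 8 + "F"
    for label in ("Leu22Phe", "Leu22Arg", "Phe31Ser"):
        print(label, apply_mutation(toy, label))
```

Checking the wild-type residue before substitution guards against numbering
errors, which matters here because the positions are given relative to the
wild-type mDHFR sequence rather than to the F[1,2] fusion construct.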
d) Testing of the eukaryotic survival assay

It is necessary at the outset to test whether induced interactions with p70S6k
can be detected. One can use the same test system as that for the E. coli test
system described above: induction of association of p70S6k with S6 protein.
Methodology:

mDHFR Leu22Phe mutant S6-F[1,2] and p70S6k-F[3], or F[1,2] and p70S6k-F[3] (as
a control), will be cotransfected into COS cells, and the cells will be serum
starved for 48 hours, followed by replating of the cells at low density in
serum and MTX. Colonies will be selected and expanded for analysis of kinase
activity against 40S ribosomal subunits, and for coexpression of the two
proteins. Further controls will be performed for inhibition of protein
association with wortmannin and rapamycin.

e) Modification of the eukaryotic survival assay for library screening

An important part of the work required in creating a library for use in
eukaryotic cells will have been accomplished already, as the EcoRI/XhoI
directional cDNA produced by the Stratagene "cDNA Synthesis Kit" can directly
be inserted directionally into the Stratagene Zap Express vector.
Methodology:

Steps 1 through 5 are parallel to those for the bacterial library screening
(above). 1-Again, the library is constructed as a C-terminal fusion to mDHFR
F[3]. F[3] (with no stop codon) will be inserted in frame in Zap Express,
followed by insertion of the novel polylinkers allowing expression of the
inserts in all three reading frames (described above), and by the EcoRI/XhoI
directional cDNA. This bacteriophage library will be propagated and treated
with the Stratagene helper phage to excise a eukaryotic expression phagemid
vector (pBK-CMV) carrying the fusion inserts. 2-Cotransfection of the library
and p70S6k-F[1,2] constructs in eukaryotic cells: the screening will be
performed in COS or 293 cells, as these are responsive to serum in activating
the p70S6k signaling pathway. Selection experiments will be performed as
described for the S6 test system above. 3-Propagation, isolation and sequencing
of the insert DNA will be undertaken. 4-The cloned fusion proteins will be
sized on SDS-PAGE by direct visualization after 35S-Met/Cys labeling, or by
Western blotting using a commercial polyclonal antibody to mDHFR.
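
The reason for using three polylinker variants carrying 0, 1 or 2 additional
nucleotides can be illustrated with a small translation exercise. The sketch
below is an illustration only: the upstream sequence, the spacer base and the
insert are hypothetical, made-up sequences, not the actual F[3] or polylinker
sequences. It shows that, for a given insert, only one of the three spacer
lengths places the insert's coding sequence in frame with the upstream fusion
partner.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical in-frame 3' end of the upstream fusion partner (Met-Val-Arg).
UPSTREAM = "ATGGTTCGA"
# Hypothetical insert: one stray base, then a coding sequence for M-A-K-G.
INSERT = "C" + "ATGGCTAAAGGT"


def fusion_products() -> dict[int, str]:
    """Translate the fusion for spacers of 0, 1 and 2 extra nucleotides."""
    return {n: translate(UPSTREAM + "G" * n + INSERT) for n in range(3)}


if __name__ == "__main__":
    for extra, protein in fusion_products().items():
        print(extra, protein)
```

With this made-up insert, only the two-nucleotide spacer reads through into
the intended M-A-K-G sequence; the other two spacers give an out-of-frame or
prematurely terminated product. Because the frame of each cDNA fragment in a
library is not known in advance, inserting the library into all three vectors
covers every case.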

Generalization of the Strategy

The scheme for detecting partners for the protein p70S6k can be applied to
studies of any biochemical pathway in any living organism. Such pathways may
also be related to disease processes. The disease-related pathway may be an
intrinsic process of cells in humans in which a pathology arises from, for
instance, mutation, deletion, or under- or overexpression of a gene.
Alternatively, the biochemical pathway may be one that is specific to a
pathogenic organism or to the mechanism of host invasion. In this case,
component proteins of such processes may be targets of a therapeutic strategy,
such as development of drugs that inhibit invasion by the organism or a
component enzyme in a biochemical pathway specific to the pathogenic organism.

Inflammatory diseases are a case in point that can concern both examples. The
protein-protein interactions that mediate the adhesion of leukocytes to
inflamed tissues are known to involve such proteins as vascular cell adhesion
molecule-1 (VCAM-1), and certain cytokines such as IL-6 and IL-8 that are
produced during inflammation. However, many of the proteins involved in the
onset of the inflammatory response remain unknown; further, the intracellular
signaling pathways triggered by the extracellular associations are poorly
understood. The PCAs could be used in elucidation of the mechanisms underlying
the onset of inflammation, as well as the ensuing signaling. For example,
signaling pathways associated with inflammation, such as those mediated by
IL-1, IL-6, IL-8 and tumor necrosis factor, have been studied in some detail
and many direct and downstream regulators are known. These regulators can be
used as starting point targets in a PCA screening to identify other signaling
or modulating proteins that could also be targets for drug development.

There is an increased risk of infection by enteric pathogens in the presence of
the intestinal inflammation that characterizes idiopathic intestinal diseases.
There are two mechanisms which need to be better understood here and which can
be addressed by PCA: i) the cellular mechanisms of inflammation as described
above, and ii) the discovery of the specific cell-surface ligands which the
pathogenic organisms recognize and associate with. Secreted proteins produced
by the pathogen can bind to the basolateral membrane of epithelial cells (as is
the case in Yersinia pseudotuberculosis infection) or be translocated into
intestinal epithelial cells (Salmonella infection), promoting infectivity
and/or physiological




responses to the infection. However, in most cases the interactions between the 
pathogenic protein and the epithelial cells are unknown. 
Cell adhesion and nervous system regeneration 

A related example in cell adhesion includes processes involved in development
and regeneration in the nervous system. Cadherins are membrane proteins that
mediate calcium-dependent cell-cell adhesion. To do so they need another class
of cytoplasmic proteins called catenins, which make a bridge between cadherins
and the cytoskeleton. Catenins also regulate genes that control
differentiation-specific genes. For instance, the protein beta-catenin can
interact in certain situations with a transcription factor (LEF-1) and be
translocated into the nucleus, where it constrains the number of genes
transactivated by LEF-1 (differentiation). This process is regulated by the Wnt
signaling pathway (homologous to the wingless pathway in Drosophila) by
inactivation of GSK3-beta, which permits degradation of APC (a cytoplasmic
adapter protein). PCA strategies could be used to identify novel proteins
involved in the regulation of these processes.

Proteins involved in viral integration processes are examples of targets that
could be tested for inhibitors using the PCA strategies. Examples for HIV
include:

i) Inhibition of integrase or the transport of the pre-integration complex:
protein MA or Vpr.

ii) Inhibition of the cell cycle in G2 by Vpr (interaction with cyclin B),
causing induction of apoptosis.

iii) Inhibition of the interaction of gp160 (precursor of the membrane
proteins) with furin.

Accessory proteins of HIV as a therapeutic target:

i) Vpr: nuclear localizing sequence (target): interaction site of Vpr with
phosphatase A.

ii) Vif: interaction with vimentin (cytoskeleton-associated protein).

iii) Vpu: degradation of CD4 in the ER mediated by the cytoplasmic tail of Vpu.

iv) Nef: myristoylation signal of Nef.

EXAMPLE 3
Other Examples of Protein Fragment Complementation Assays

Other examples of assays are exemplified herein. These assays provide
alternative PCA strategies that would be appropriate for specific protein
association problems, such as studying equilibrium or kinetic aspects of
assembly. Also, it is possible that in certain contexts (for example, specific
cell types) or for certain applications, a specific PCA will not work but an
alternative one will. Brief descriptions of each of these other PCA embodiments
are provided hereinbelow.

1) Glutathione-S-Transferase (GST) 

GST from the flat worm Schistosoma japonicum is a small (28 kD), monomeric,
soluble protein that can be expressed in both prokaryotic and eukaryotic cells.
A high resolution crystal structure has been solved and serves as a starting
point for design of a PCA. A simple and inexpensive colorimetric assay for GST
activity has been developed, consisting of the conjugation of reduced
glutathione with 1-chloro-2,4-dinitrobenzene (CDNB) to give a brilliant yellow
product. A PCA based on structural criteria similar to those used to develop
the DHFR PCA, using GCN4 leucine zippers as oligomerization domains, was thus
designed. Cotransformants of zipper-GST-fragment fusions are expressed in E.
coli on agar plates and colonies are transferred to nitrocellulose paper.
Fragment complementation is detected in an assay where a glutathione-CDNB
reaction mixture is applied as an aerosol on the nitrocellulose, and colonies
co-expressing the fragments of GST are detected as yellow images.

2) Green Fluorescent Protein (GFP) 

GFP from Aequorea victoria is becoming one of the most popular protein markers
for gene expression. This is because the small, monomeric 238-amino-acid
protein is intrinsically fluorescent due to the presence of an internal
chromophore that results from the autocatalytic cyclization of the polypeptide
backbone between residues Ser65 and Gly67 and oxidation of the alpha-beta bond
of Tyr66. The GFP chromophore absorbs light optimally at 395 nm and also
possesses a second absorption maximum at 470 nm. This bi-specific absorption
suggests the existence of two low energy conformers of the chromophore whose
relative population depends on the local environment of the chromophore. A
mutant, Ser65Thr, that eliminates isomerization (single absorption maximum at
488 nm) results in a 4 to 6 times more intense fluorescence than the wild type.
Recently the structure of GFP has been solved by two groups, making it a
candidate for a structure-based PCA design, development of which has begun. As
with the GST assay, all of the initial development is done in E. coli with GCN4
leucine zipper-forming sequences as oligomerization domains. Direct detection
of fluorescence by visual observation under broad spectrum UV light will be
used. This system will also be tested in COS cells, selecting for
co-transfectants using fluorescence activated cell sorting (FACS).

3) Firefly Luciferase

Firefly luciferase is a 62 kDa protein which catalyzes oxidation of the
heterocycle luciferin. The reaction possesses one of the highest quantum yields
for bioluminescent reactions: one photon is emitted for every oxidized
luciferin molecule. The structure of luciferase has recently been solved,
allowing for structure-based development of a PCA. As with the GST assay
described herein, cells are grown on a nitrocellulose matrix. The addition of
the luciferin at the surface of the nitrocellulose permits it to diffuse across
the cytoplasmic membrane and trigger the bioluminescent reaction. The detection
is done immediately on a photographic film. Luciferase is an ideal candidate
for a PCA: the detection assays are rapid, inexpensive, very sensitive, and
utilize a non-radioactive substrate that is available commercially. The
substrate of luciferase, luciferin, can diffuse across the cytoplasmic membrane
(under acidic pH), allowing the detection of luciferase in intact cells. This
enzyme is currently utilized as a reporter gene in a variety of expression
systems. The expression of this protein has been well characterized in
bacterial, mammalian, and plant cells, suggesting that it would provide a
versatile PCA.

4) Xanthine-guanine phosphoribosyl transferase (XGPRT) 

The E. coli enzyme XGPRT converts xanthine to xanthine monophosphate (XMP), a
precursor of GMP. Because the mammalian enzyme hypoxanthine-guanine
phosphoribosyl transferase (HGPRT) can only use hypoxanthine and guanine as
substrates, the bacterial XGPRT can be used as a dominant selection assay for a
PCA for cells grown in the presence of xanthine. Vectors expressing XGPRT
confer on mammalian cells the ability to grow in selective medium containing
adenine, xanthine, and mycophenolic acid. The function of mycophenolic acid is
to inhibit de novo synthesis of GMP by blocking the conversion of IMP into XMP
(Chapman A. B. 1983, Molec. Cell. Biol. 3:1421-1429). The only GMP produced
comes from the conversion of xanthine into XMP, catalyzed by the bacterial
XGPRT. As with aminoglycoside phosphotransferase, fragments of XGPRT can be
generated based on the known structure (see Table 1) using the design-evolution
strategy described above, with fragments fused to the GCN4 leucine zippers as
test oligomerization domains. The complementary fusions are cotransfected and
the proteins transiently expressed in COS-7 cells, or stably expressed in CHO
cells, grown in the selective medium. In the case of CHO cells, colonies are
collected and sequentially re-cultured at increasing concentrations of the
selective compounds in order to enrich for populations of cells that
efficiently express the fusions at high concentrations.

5) Adenosine deaminase 

Adenosine deaminase (ADA) is present in minute quantities in virtually all
mammalian cells. Although it is not an essential enzyme for cell growth, ADA
can be used in a dominant selection assay. It is possible to establish growth
conditions in which the cells require ADA to survive. ADA catalyzes the
irreversible conversion of cytotoxic adenine nucleosides to their respective
nontoxic inosine analogues. When cytotoxic concentrations of adenosine or
cytotoxic adenosine analogues such as 9-beta-D-xylofuranosyladenine are added
to the cells, ADA is required for cell growth to detoxify the cytotoxic agent.
Cells that incorporate the ADA gene can then be selected for amplification in
the presence of low concentrations of 2'-deoxycoformycin, a tight-binding
transition state analogue inhibitor of ADA. ADA can then be used for a PCA
based on cell survival (Kaufman et al. 1986, Proc. Nat. Acad. Sci. (USA)
83:3136-3140). As with the other systems described above, fragments of ADA can
be generated based on the known structure (see Table 1) using the
design-evolution strategy described above, with fragments fused to the GCN4
leucine zippers as test oligomerization domains. The complementary fusions are
cotransfected and the proteins transiently expressed in COS-7 cells, or stably
expressed in CHO cells, grown in the selective medium containing
2'-deoxycoformycin. In the case of CHO cells, colonies are collected and
sequentially re-cultured at increasing concentrations of 2'-deoxycoformycin in
order to enrich for populations of cells that efficiently express the fusions
at high concentrations.

6) Bleomycin binding protein (zeocin resistance gene) 

Zeocin, a member of the bleomycin/phleomycin family of antibiotics, is toxic to
bacteria, fungi, plants, and mammalian cells. The expression of the zeocin
resistance gene confers resistance to bleomycin/zeocin. The protein confers
resistance by binding to and sequestering the drug, thus preventing its
association with and hydrolysis of DNA (Berdy, J. 1980, In Amino Acid and
Peptide Antibiotics, Berdy, ed. Boca Raton, FL: CRC Press, pp. 459-497; Mulsant
et al. 1989, Somat. Cell. Mol. Genet. 14:243-252). Bleomycin binding protein
(BBP) could then be used for a PCA based on cell survival. As with the other
systems described above, fragments of BBP can be generated based on the known
structure (see Table 1) using the design-evolution strategy described above,
with fragments fused to the GCN4 leucine zippers as test oligomerization
domains. The BBP is a small (8 kD) dimer that binds to drugs via a subunit
interface binding site. For this reason, the design would be somewhat different
in that first, a single chain form of the dimer would be generated by making a
fusion of two BBP genes with a short sequence coding for a simple polypeptide
linker introduced between the two subunits. One fragment in this case will be
based on a short sequence of one of the subunit modules, while the other
fragment will be composed of the remaining sequence of that subunit plus the
other subunit. Complementation and selection experiments will be performed as
described for the examples above using bleomycin or zeocin as selective drugs.

7) Hygromycin-B-phosphotransferase 

The antibiotic hygromycin-B is an aminocyclitol that inhibits protein synthesis
by disrupting translocation and promoting misreading. The E. coli enzyme
hygromycin-B-phosphotransferase detoxifies the cells by phosphorylating
hygromycin-B. When expressed in mammalian cells,
hygromycin-B-phosphotransferase can confer resistance to hygromycin-B (Gritz et
al. 1983, Gene 25:179-188). The enzyme is a dominant selectable marker that
could be used for a PCA based on cell survival. While the structure of the
enzyme is not known, it is suspected that this enzyme is homologous to
aminoglycoside kinase (Shaw et al. 1993, Microbiol. Rev. 57:138-163). It is
therefore possible to use the combined design/evolution strategy disclosed
herein to produce a PCA with this enzyme and perform dominant selection in
mammalian cells with selection at increasing concentrations of hygromycin B.
8) L-histidinol NAD+ oxidoreductase

The hisD gene of Salmonella typhimurium codes for the L-histidinol NAD+
oxidoreductase that converts histidinol to histidine. Mammalian cells grown in
media lacking histidine but containing histidinol can be selected for
expression of hisD (Hartman et al. 1988, Proc. Nat. Acad. Sci. (USA)
85:8047-8051). An additional advantage of using hisD in dominant selection is
that histidinol is itself toxic, inhibiting the activity of endogenous
histidyl-tRNA synthetase. Furthermore, histidinol is inexpensive and readily
permeates cells. The structure of histidinol NAD+ oxidoreductase is unknown and
so development of a PCA based on this enzyme is based entirely on the
exonuclease fragment/evolution strategy, disclosed above.

The following Table lists alternative embodiments using other PCA reporters.
Abbreviations in Table: Type: D, dominant selection marker; R, recessive
selection marker. Structure: four-letter codes = Protein Data Bank (PDB)
entries; K, known but not deposited in PDB; U, unknown. Mono/oligo: M, monomer;
D, dimer; Tetra, tetramer.

TABLE 1. A list of Other Potential PCA Reporter Candidates

A-Assays based on Dominant or Recessive Selection

Enzyme | Type | Structure | Size | Mono/oligo | Selection drugs/Conditions
DHFR | R/D | many | 18kD | M | methotrexate/trimethoprim
Adenosine deaminase | D/R | 1ADD | | M | Xyl-A or adenosine, alanosine, and 2'-deoxycoformycin
Thymidine kinase | D/R | 1KIN | | D | gancyclovir, HAT
Mutant hypoxanthine-guanine phosphoribosyl transferase | D | 1HGM | | D | HAT + thymidine kinase
Thymidylate synthetase | R | 1NJE | 35kD | M | 2 fluorodeoxyuridine
Xanthine-guanine phosphoribosyl transferase | D | 1NUL | | | mycophenolic acid with limiting xanthine
Glutamine synthetase | R | 2LGS | | |
Asparagine synthetase | R | U | | | beta-aspartyl hydroxamate or albizziin
Puromycin N-acetyltransferase | D | U | 23kD | M | puromycin
Aminoglycoside phosphotransferase | D | K | 35kD | M | neomycin, G418, gentamycin
Hygromycin B phosphotransferase | D | U | | M | hygromycin B
L-histidinol NAD+ oxidoreductase | D | U | 46kD | M | histidinol
Bleomycin binding protein | | K | 8kD | D | bleomycin/zeocin
Cytosine methyl-transferase | R/D | U | | | 5-Azacytidine (5-aza-CR) and 5-aza-2'-deoxycytidine
O6-alkylguanine alkyltransferase | D | 1ADN | | | N-methyl-N-nitrosourea
Glycinamide ribonucleotide transformylase | R | 1GRC | 23.2kD | D | dideazatetrahydrofolate, minus purine
Glycinamide ribonucleotide synthetase | R | U | 45.9kD | | minus purine
Phosphoribosyl-aminoimidazole synthetase | R | U | 36.7kD | | minus purine
Formylglycinamide ribotide aminotransferase | R | U | 141.4kD | M | L-azaserine, 6-diazo-5-oxo-L-norleucine, minus purine
Phosphoribosyl-aminoimidazole carboxylase | R | U | 39.5kD | D | minus purine
Phosphoribosyl-aminoimidazole carboxamide formyltransferase | R | U | 57.3kD | | minus purine
Fatty acid synthase | R | | 272kD | D | cerulenin
IMP dehydrogenase | R | 1AK5 | 55.4kD | Tetra | mycophenolic acid

ii- Viral Plaque Assays

Enzyme | Type | Structure | Size | Mono/oligo | Selection drugs/Conditions
Thioredoxin | D | 1TDF | 34.5kD | D |
Reverse transcriptase | D | 3HVT | | |
Viral protease | D | | | |

B-Cell Death Assays

Enzyme | Type | Structure | Size | Mono/oligo | Selection drugs/Conditions
Cysteine protease: papain | D | 1STF | 38.9kD | M | inhibited by cystatin
Cysteine protease: caspase | | 1CP3 | 17kD + 12kD | HeteroD | inhibited by DEVD-aldehyde (can also be used in a fluorimetric or colorimetric assay, in vitro)
Metalloprotease: carboxypeptidase | D | | 47.1kD | M | inhibited by methyl-ethyl succinic acid
Serine protease: proteinase K | D | 1PTK | 30.6kD | M | inhibited by serpins
Aspartic protease: pepsin | D | 1PSN | 34.5kD | M | inhibited by pepstatin A (can also be used in a fluorimetric assay, in vitro)
Lysozyme | D | many | 23.2kD | M | inhibited by N-acetylglucosamine trisaccharide
RNAse | D | many | 13.3kD | M | inhibited by RNAse inhibitor
DNAse | D | 1DNK | 61.6kD | M | inhibited by actin
Phospholipase A2 | D | 1P2P | 13.8kD | M/D | many inhibitors: bromophenacyl bromide, hexadecyl-trifluoroethyl-glycero-phosphomethanol, bromoenol lactone, etc.
Phospholipase C | D | 1AH7 | 28kD | M | many inhibitors: neomycin, chelerythrine, U73122, etc.

C-Colorimetric/Fluorimetric Assay

Enzyme | Structure | Size | Mono/oligo | Selection drugs/Conditions
DT-Diaphorase (NAD(P)H-[quinone acceptor] oxidoreductase) | 1QRD | 26kD | D | NADPH-diaphorase stain, inhibited by dicumarol, Cibacron blue and phenindione. Note: can also be used in a cell death assay (nitrobenzimidazole, for example).
(NAD(P)H-[quinone acceptor] oxidoreductase)-2 | isoform of 1QRD | 21kD | D | NRH-diaphorase stain, inhibited by pentahydroxyflavone
Thermophilic diaphorase (Bacillus stearothermophilus) | | 30kD | M | NADH-diaphorase stain
Glutathione-S-transferase | 1GNE | 26kD; other isoform of 28kD | D | production of a yellow product by the conjugation of glutathione with an aromatic substance, chlorodinitrobenzene (CDNB)
Luciferase | 1LCI | 62kD | M | Fluorometric
Green-fluorescent protein | 1EMA | 30kD | M | Intrinsic fluorescence
Chloramphenicol acetyltransferase | 1CLA | 25kD | Tri | Fluorimetric: Bodipy chloramphenicol
Uricase | | 32kD | Tetra | Fluorometric
SEAP (secreted form of human placental alkaline phosphatase) | 1AJA | | M | CSPD chemiluminescent substrate
beta-Glucuronidase | 1BHG | 71kD | Tetra | Histochemical, fluorometric or spectrophotometric assays using various substrates such as X-GLUC

D-Heteromeric Enzyme Strategies

Enzyme | Structure | Size | Mono/oligo | Selection drugs/Conditions
Tyrosinase | | 30kD + 14kD | Hetero M+M | Colorimetric: synthesis of melanin

EXAMPLE 4 

Examples of Variants of PCA to detect multiple protein/protein-DNA/protein-
RNA/protein-drug complexes

While specific examples have only been given for applications of PCA to
protein-pair interactions, it is possible to apply PCA to multiprotein,
protein-RNA, protein-DNA or protein-small molecule interactions. There are two
general schemes for achieving such systems. Multi-subunit PCA: Two proteins
need not interact directly for a PCA signal to be observed; if a partner
protein or protein complex binds to two proteins simultaneously, it is possible
to detect such a three-protein complex. A multi-subunit PCA is conceived with
the example of herpes simplex virus thymidine kinase (TK), a homodimer of 40
kD. In this conception, the TK structure contains two well-defined domains
consisting of an alpha/beta domain (residues 1-223) and an alpha-helical domain
(224-374). As a test system, the Rop1 dimer, a four-helix bundle homodimer, was
used. The two fragments of TK are extracted by PCR and subcloned into the
transient transfection vector pMT3, the first in tandem to the Adenovirus major
late promoter, tripartite leader 3' to the first ATG, and the second downstream
of an EMCV internal initiation site. Restriction sites previously introduced
between the first and the last ATG are subcloned into BamHI/KpnI and PstI/EcoRI
cloning sites downstream of the two ATGs. These are used to subclone
PCR-generated fragments of the Rop1 subunits into two different vectors.
Subsequently, Ltk- cells are cotransfected by lipofection with the two
plasmids, and colonies of surviving cells are serially selected in medium
containing increasing concentrations of HAT (hypoxanthine/aminopterin/
thymidine). Cells that express complementary fragments of TK fused to the Rop1
four-helix bundle subunits will proliferate under this selective pressure, or
otherwise die. Specific examples of use of this concept would be in determining
constituents of multiprotein complexes that are formed transiently or
constitutively in cells.

The utility of PCA is not limited to detecting protein-protein interactions,
but can be adapted to detecting interactions of proteins with DNA, RNA, or
small molecules. In this conception, two proteins are fused to PCA
complementary fragments, but the two proteins do not interact with each other.
The interaction must be triggered by a third entity, which can be any molecule
that will simultaneously bind to the two proteins or induce an interaction
between the two proteins by causing a conformational change in one or both of
the partners. Two examples have been demonstrated by the Applicant using the
mDHFR PCA in E. coli. In the first case a natural product, the
immunosuppressant drug rapamycin, is used to induce an interaction between its
receptor FKBP12 and a partner protein mTOR (mammalian Target of Rapamycin). The
interaction was detected by cotransformation of DHFR fragments fused to FKBP or
mTOR into E. coli grown in the presence or absence of trimethoprim (as
described above) and rapamycin (0-10 nM). Support of growth, as detected by
colony formation, was demonstrated to be completely dependent on the addition
of rapamycin, suggesting that the mDHFR PCA is detecting a rapamycin-induced
assembly of an FKBP12-mTOR complex and subsequent reconstitution of DHFR
activity. This is one example of a use of the PCA strategy to test for small
molecules that can induce interactions between proteins. General applications
could be made for therapeutic development. For example, screening small
molecule combinatorial compound libraries for molecules that induce
interactions between proteins could be carried out. These molecules may inhibit
the activities of either or both of the proteins, or activate specific cellular
processes that are initiated by other events, such as growth factor-mediated
receptor dimerization. The discovery of such small molecules could lead to the
development of orally available drugs for the treatment of a broad spectrum of
human diseases.

Another example of an induced interaction that was studied 
with the DHFR PCA is the hteraction of the oncogene GTPase p21 ras and its 
direct downstream target, the serine/threonine kinase raf. This interaction only 
occurs when the GTPase is in the GTP-bound form, whereas turnover of GTP 
to GDP leads to release of the complex. As with the FKBP-mTOR complex.this 
induced interaction in £ coli could be demonstrated. PCA could be used in a 
general way to study such induced interactions, and to screen for compounds 
that release or prevent these interactions in pathological states. The ras-raf 
interaction itself could be a target of therapeutic intervention. Oncogenic forms
of ras consist of mutants that are incapable of turning over GTP and therefore
remain continuously associated with activated raf. This leads to a constitutive
uncontrolled growth signal that results, in part, in oncogenesis. The identification 
of compounds that inhibit this process, by PCA, would be of value in broad 
treatment of cancers. Other examples of multimolecular applications of PCA
could include identification of novel DNA or RNA binding proteins. In its simplest
conception, a skilled artisan could use a known DNA or RNA binding motif, for
instance a retinoic acid receptor zinc finger or a simple RNA binding protein
such as IF-1, respectively. One half of the PCA could consist of the DNA or RNA
binding domain fused to one of the PCA fragments (control fragment).
The complementary fragment would be fused to a cDNA library. A third entity
could also be used: a gene encoding a sequence that contains an element known
to bind to the control protein, followed by a second putative or known
regulatory element. A test system consists of tat/tar elements
that control elongation in transcription/translation of HIV genes. An example of
an application would be the identification of tat binding elements that have been
proposed to exist in eukaryotic genomes and may regulate genes in the same
or similar way to that of HIV genes (SenGupta et al. 1996. Proc. Natl. Acad. Sci.
USA 93:8496-8501).

EXAMPLE 5 

Examples of PCA application in drug screening: Screening combinatorial
libraries for compounds that inhibit or induce
protein-protein/protein-RNA/protein-DNA complexes
1) Drug screening.

The PCA strategy can be directly applied to identifying 
potential therapeutic molecules contained in combinatorial libraries of organic 
molecules. It is possible to perform high throughput screening of such libraries 
to screen for compounds that will inhibit or induce protein-protein interactions or 
protein-nucleic acid interactions (as discussed above). In addition it is also 
possible to screen for compounds that inhibit enzymes whose substrates are 
other proteins, DNA, RNA or carbohydrates, as discussed below. In this
application, proteins that interact/protein substrate pairs, or control DNA/RNA
binding protein-enzyme pairs are fused to PCA complementary fragments and
plasmids harboring these pairs are transformed/transfected into a cell, along with
any third DNA or RNA element as the case requires. Transformed/transfected
cells are grown in liquid cultures in multiwell plates, each well being inoculated 
with a single compound from an array of combinatorially synthesized 
compounds. A readout of a response depends on the effect of a compound. If 
the compound inhibits a protein interaction, there is a negative response (the
absence of a PCA signal is the positive result). If the compound induces a protein
interaction, the response is a positive PCA signal. Controls for non-specific
effects of compounds include: 1) demonstration that the compound does not
affect the PCA enzyme itself (test against cells transfected with the wild-type
intact enzyme used as the PCA probe) and 2) in the case of a cell survival assay,
demonstration that the compound is not toxic to cells that have not been
transformed/transfected. As well as providing a high throughput assay for
biological activity of compounds, PCA also offers the advantage over in vitro
assays that it is a test for cell membrane permeability of active compounds.
Specifically demonstrated examples of PCA for drug screening from the
Applicant include the application of DHFR PCA in E. coli for the detection of
compounds that inhibit therapeutically relevant targets, such as Bax/Bcl2,
FKBP12/TOR, ras/raf, the carboxy-terminal dimerization domain of HIV-1 capsid
protein, and the IκB kinase IKK-1 and IKK-2 dimerization domains (leucine zippers and
helix-loop-helix domains). In each case, the two proteins are subcloned 5'
upstream of either F[1,2] or F[3] as described above. Plasmids harboring the 
complementary fragments are cotransformed into BL21 cells. Colonies from 
minimal medium plates containing IPTG and trimethoprim are picked, and grown 
in liquid medium under the same selective conditions and frozen stocks made. 
For a single screening cycle, a priming overnight culture is grown from frozen 
stocks in LB medium. A selective minimal medium containing trimethoprim,
ampicillin and IPTG is aliquoted at 25 µl into each well of a 384-well plate. Each well
is then inoculated with 1 µl of an individual sample from a compound array
(ArQule Inc.) to give a final concentration of 10 µM. Each well is then inoculated
with 2 µl of overnight culture and plates are incubated in a specially adapted
shaker bath at 37 °C. At 2 hour intervals, plates are read on an optical
absorption spectroscopic plate reader coupled to a PC and spreadsheet software 
at 600 nm (scattering) for a period of 8 hours. Rates of growth are calculated 
from individual time readings for each well and compared to a standard curve. 
A "hit" is defined as the identification of an individual compound which reduces 
the rate of growth to less than the 95 % confidence interval based on the 
standard deviation for growth rates observed in all of the wells within the test 
plate. "Near hits" are defined as those cases where growth rates are within the 
95 % confidence interval. For each of the hits or near hits, the following controls
are then performed: The same experiment is performed with BL21 cells that are
transformed with empty vector (and no trimethoprim), with vector harboring the
full length mDHFR gene, or with cotransfected cells where protein expression
is not induced by IPTG. If in all of these cases the compound has no effect, it
can be concluded that it is specifically disrupting the protein-protein interaction
being tested. Such validated hits or near hits are then retested to establish a
dose-response curve for the individual compound, with concentrations varying
from 1 pM up to 1 pM by orders of magnitude of 10. The PCA strategy for
compound screening can also be applied in the multiprotein protein-RNA/DNA
cases as described above, and can easily be adapted to the DHFR or any other
PCA in E. coli or in yeast versions of the same PCAs. Such screening can also
be applied to enzymes whose targets are other proteins or nucleic acids for 
known enzyme/substrate pairs or to novel enzyme substrate pairs identified as 
described below. 
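
The hit-calling rule described above can be sketched in a few lines of Python. This is an illustration only and not the Applicant's software. It assumes that the growth rate of a well is taken as the slope of ln(OD600) against time, and that "less than the 95 % confidence interval" means more than 1.96 standard deviations below the mean growth rate of the plate. The function names, well labels and readings are hypothetical.

```python
import math
import statistics

def growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in 1/hour."""
    logs = [math.log(v) for v in od600]
    t_mean = statistics.fmean(times_h)
    y_mean = statistics.fmean(logs)
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    den = sum((t - t_mean) ** 2 for t in times_h)
    return num / den

def call_hits(rates, z=1.96):
    """Wells whose growth rate is below (plate mean - z * plate SD)."""
    values = list(rates.values())
    cutoff = statistics.fmean(values) - z * statistics.stdev(values)
    return sorted(well for well, rate in rates.items() if rate < cutoff)

# Hypothetical plate: readings every 2 h for 8 h, as in the protocol.
times = [0, 2, 4, 6, 8]
true_rates = {f"well{i:02d}": 0.49 + 0.01 * (i % 3) for i in range(23)}
true_rates["well23"] = 0.10  # one inhibited well (made-up value)
readings = {w: [0.05 * math.exp(r * t) for t in times]
            for w, r in true_rates.items()}

rates = {w: growth_rate(times, od) for w, od in readings.items()}
hits = call_hits(rates)  # wells flagged for the follow-up controls
```

Wells flagged in this way would still have to pass the empty-vector, full-length mDHFR and uninduced controls before being treated as validated hits.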

Proteins involved in viral integration processes are examples 
of targets that could be tested for inhibitors using the PCA strategies. Examples 
for the HIV virus and accessory proteins of HIV as a therapeutic target have 
been given in Example 2. 

Other general targets for drug screening could include 
proteins linked to neurodegenerative diseases, such as alpha-synuclein. This 
protein has been linked to early onset of Parkinson's disease and it is
also implicated in Alzheimer's disease. Another example is β-amyloid proteins,
also linked to Alzheimer's disease. 

An example of protein-carbohydrate interactions that could be 
a target for drug screening includes the selectins that are generally implicated in
inflammation. These cell surface glycoproteins are directly involved in 
diapedesis. 

A number of tumor suppressor genes whose actions are
mediated by protein-protein interactions could be screened for potential anti-
cancer compounds. These include PTEN, a tumor suppressor directly involved
in the formation of hamartomas, in inherited breast and in thyroid cancer. Other
interesting tumor suppressor genes include p53, Rb and BRCA1.

EXAMPLE 6

Examples of applications of the PCA strategy to detect enzyme/substrate 
interactions 

The examples described above are used for identifying novel 
molecular interactions involving molecules that merely bind to each other. 
However, detecting the substrates of enzymes is also fully compatible with the
PCA strategy as shown below: 

i) Enzymes that form tight complexes with protein substrates or induce efficient 
PCA fragment assembly or 

ii) Mutant enzymes that bind tightly to substrates but do not undergo product
release because of mutations of residues involved in nucleophilic attack and/or
product release (substrate trapping).

Enzymes may form tight complexes with their substrates (Kd
~1-10 pM). In these cases PCA may be efficient enough to detect such
interactions. However, even if this is not true, PCA may work to detect weaker
interactions. Generally, if the rate of catalysis and product release is slower than
the rate of folding-reassembly of the PCA complementary fragments, effectively
irreversible folding and reconstitution of the PCA reporter activity will have 
occurred. Therefore, even if the enzyme and substrate are no longer interacting 
the PCA signal can be detected. Therefore, the detection of novel enzyme 
substrates using PCA may be possible, independent of effective substrate Kd or 
rate of product release. In cases where product release is much faster than PCA 
fragment assembly/folding, an alternative approach is provided by generating 
"substrate trapping" mutants of the test enzyme. An example of this approach 
applies to the protein tyrosine phosphatase PTP1B, for which substrate trapping
mutants have been generated by mutating the nucleophilic aspartate 181 to
alanine, rendering the enzyme catalytically dead, but capable of forming tight
complexes with a known substrate, the EGF receptor, and other unknown
proteins (Flint et al. 1996. Proc. Natl. Acad. Sci. USA 94:1680-1685).

An application of using PCA to screen for interacting partners 
of PTP1B is given as follows. First, the aminoglycoside kinase (AK)-based PCA 
in transiently transfected COS or 293 cells is used. The substrate trap mutant 
catalytic domain of PTP1B is fused to the N-terminal complementary fragment of AK
while a C-terminal fusion of the other AK fragment is made to a cDNA library.
Cells are co-transfected with complementary AK pairs and grown in selective
concentrations of G418. After 72 hours, colonies of surviving cells are picked
and in situ PCR is performed using primers designed to anneal to 3' and 5'
flanking regions of the cDNA coding region. PCR amplified products are then 5'
sequenced to identify the gene.

Enzyme inhibitors: Screening combinatorial libraries of
compounds for those that inhibit enzyme-protein substrate complexes can
also be carried out with:

i) Enzymes that form tight complexes with protein substrates; or 

ii) Mutant enzymes that bind tightly to substrate but do not undergo product 
release because of the mutation. 

EXAMPLE 7 

Applications of the PCA strategy to protein engineering/evolution

The PCA strategy can be used to generate peptides or proteins with novel binding
properties that may have therapeutic value, as is done with phage display
technology. It is also possible to develop enzymes with novel substrates or
physical properties for industrial enzyme development. Two detailed examples
of such applications of the PCA strategy are given here, with additional
applications listed below.

1) Selection of high-affinity, heterodimerizing leucine zipper sequences (J.
Pelletier, K. Arndt, A. Plueckthun and S. Michnick, manuscript in preparation)

The mDHFR PCA, described above, was used in a scheme
for the selection of efficiently heterodimerizing, designed leucine zippers. It has
been proposed that the formation of salt bridges between positively and
negatively charged residues at complementing "e" and "g" positions is important
in stabilizing leucine zipper formation, though this view has been contested. In
order to help define the importance of salt-bridge formation at the e and g
positions, two leucine zipper libraries were built. Both are based on the GCN4
leucine zipper sequence, but contain sequence information specific to either Jun
or Fos zippers in order to create heterodimerizing pairs. As well, the e-1 to e-4
and g-1 to g-4 positions in each library were randomized to code for positively
or negatively charged residues, or neutral polar residues. These libraries were
amplified by PCR and subcloned into the Z-F[1,2] or Z-F[3] constructs (described
above) from which the GCN4 zipper sequences had been removed. The
bacterial mDHFR PCA selection was performed on selective solid media, as 
described earlier. Colonies were picked and sequenced; sequence analysis
reveals that the distribution of charged or neutral residues at e-g pairs is not
random, but is biased toward pairing of opposite charges, or pairing of a charged
with a neutral residue, rather than same-charge pairing (see figure 7). It was
reasoned that better zipper pairing should lead to an increase in efficiency of
DHFR-fragment complementation, resulting in faster bacterial doubling times
(see Table 1 in the mDHFR PCA description). A selection/enrichment of the
novel zippers relative to GCN4 was thus undertaken, as
follows. The designed zipper libraries, expressed as N-terminal fusions to the
DHFR F[1,2] or F[3:I114A], were cotransformed; clones were picked, propagated
and mixed in selective liquid culture, and the mix was added in a 1:1 000 000
ratio to clone Z-F[1,2] + Z-F[3:I114A] (original GCN4 leucine zippers). The
mixture was propagated in selective liquid culture over multiple passages. 
Restriction analysis shows that within 4 passages, the population of GCN4- 
expressing bacteria is diminishing relative to the novel zipper sequences (data 
not shown), indicating that some of the designed zipper-containing clones are 
propagated at a higher rate than those containing GCN4. Bacteria from later 
passages were plated on selective medium, and individual clones sequenced to 
reveal the identity of the most successful designed zipper pairs (data not
shown). 
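
The reasoning behind this enrichment can be illustrated with a short calculation. The sketch below assumes simple exponential growth with no nutrient limitation; the doubling times are hypothetical and are not taken from Table 1. Only the 1:1 000 000 starting ratio comes from the text.

```python
def novel_fraction(hours, start_ratio, td_novel_h, td_ref_h):
    """Fraction of cells carrying novel zippers after `hours` of growth.

    start_ratio is novel:reference at time zero; td_* are doubling times
    in hours. Both populations are assumed to grow exponentially.
    """
    ratio = start_ratio * 2 ** (hours / td_novel_h - hours / td_ref_h)
    return ratio / (1 + ratio)

START_RATIO = 1e-6  # 1:1 000 000, as in the text

# Hypothetical doubling times: 1 h for a novel pair, 2 h for GCN4.
for hours in (0, 20, 40, 60):
    print(hours, novel_fraction(hours, START_RATIO, 1.0, 2.0))
```

Under these assumed doubling times the novel clones would overtake the GCN4 clone after roughly 40 hours of cumulative growth, which is why a modest growth advantage can become visible within a few passages.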

2) Application of PCA to enzyme function and design 

PCA Development: Adenosine deaminase (ADA) meets all of the criteria for a
PCA listed above. ADA is a small (~40 kD), easily purified monomeric zinc
metallo-enzyme and the structure of murine ADA has been resolved. Several
in vitro ADA activity assays have been developed, involving UV
spectrophotometry and stopped-flow fluorimetry. E. coli ADA catalyzes the
irreversible conversion of cytotoxic adenine nucleosides to non-toxic inosines.
Eukaryotic or prokaryotic cells propagated in the presence of cytotoxic
concentrations of adenosine or adenosine analogs require ADA to detoxify these
compounds. This is the basis of a dominant-selection strategy used to select for
cells expressing a specific gene in mammalian cells. The ADA gene has also
been expressed in SF3834 E. coli cells, which lack a gene coding for
endogenous ADA. When the gene coding for ADA is introduced into ADA-deficient
bacteria, those cells that express ADA are able to survive high
concentrations of added adenosine; those that do not, die. This forms the basis
of an in vivo ADA activity assay.

ADA was thus chosen, principally because it can be used as
a dominant selective marker in mammalian and bacterial cells in which the gene
has been knocked out. The reason a dominant selective gene was chosen is
that in screening for novel protein-protein interactions, particularly testing
for interactions of a known protein against a library of millions of independent
clones, selection serves to filter out cells that may show a positive response for
reasons independent of a specific protein-protein interaction. Three test
systems of interacting proteins will thus be used: leucine zipper-forming
sequences, the proteins raf and p21, and the induced oligomerization system of
FK506 binding protein (FKBP) and mTOR, which interact through the macrocyclic
immunosuppressant compound rapamycin. For all of these systems, an E. coli
construct and mammalian transient transfection plasmids will be used. The test 
proteins will be subcloned as fusions to ADA complementary fragments. The 
primary assay will be survival of SF3834 E. coli cells that have been transformed
with the complementary ADA fragments fused to the test oligomerization proteins
in the presence of toxic concentrations of adenosine. Fusion proteins will then
be purified from colonies and in vitro assays of ADA activity performed as
described below. The utility of the ADA PCA as a method to identify novel
proteins that interact with a test bait will be tested in mammalian COS-7 and
HEK-293T cells transiently transfected with FKBP fused to one of the ADA
fragments and the other fragment fused to a cDNA library from normal human
spleen containing 10^6 independent clones. As with the E. coli assay, cells that
survive in a medium containing toxic concentrations of adenosine are collected and
isolated plasmids will be tested to identify the gene for the interacting protein by
PCR amplification and chain propagation-termination techniques.

Structural motifs required for protein function

Determination of the structural elements required for the 
enzymatic function of ADA is investigated through alteration of the structures
of the enzyme fragments. At first, ADA is cut into two separate domains - one 
responsible for substrate binding (residues 1-210) and one responsible for 
catalysis (residues 211-352). These separate pieces will be attached to known 
assembly domains, such as leucine zippers (see Example 1 above). 
Reassembly will restore activity which will be assessed through detailed in vitro 
kinetic analysis of the binding and catalytic properties of the re-assembled 
enzyme, using UV spectrophotometry and stopped-flow fluorimetry to observe 
the enzymatic reactions. This system will provide another handle on the 
manipulation of enzyme activity that will afford a powerful tool for enzymatic 
mechanism study. For example, the difference in the kinetic behaviour of the 
reassembled enzyme on mixing with the substrate, compared to enzyme 
reassembled in presence of substrate (where substrate may already be bound 
by binding domain) will allow a sophisticated level of study of the importance of 
binding energy to catalysis. Subsequent point mutations to the functional or 
assembly domains of the proteins will then allow a very subtle perturbation and 
detailed quantification of the relationship of binding energy to catalysis. This 
precise control over the structure and assembly of separate functional domains
of the enzyme will permit very sophisticated enzymatic structure-function studies,
the definition of structural motifs and an understanding of their role in catalysis.

Novel protein catalyst design

The detailed knowledge of the enzyme mechanism gained 
through determination of the structural requirements for catalysis will then be 
exploited through the combination of these functional "building blocks" with the 
functional motifs responsible for substrate binding and catalysis in other 
enzymes, allowing the generation of novel protein catalysts. For example, the 
catalytic motif from ADA is modified to a cytidine-binding motif, creating a novel 
enzyme with potentially useful catalytic properties. The activity of these novel 
enzymes can easily be assessed through in vivo assays similar to that of the 
PCA system, or through in vitro activity assays. Furthermore, the detailed 
mechanistic investigation of the resulting enzymes, possible with this system, will
permit the rational design of each subsequent generation of catalysts. 

EXAMPLE 8 

Examples of applications of the PCA strategy to detect molecular
interactions in whole organisms 

The use of the PCA to detect molecular interactions in whole 
organisms is a logical extension of the PCA applications described above. 
Whole model organisms such as drosophila, nematodes, zebra fish and puffer 
fish are non-limiting examples of such organisms. The sole differences from the
other listed examples are that vectors used would need to be different (for
example retroviral vectors) and that any substrates needed by the PCA would 
need to be bioavailable, or detection would need to be performed in situ. 

EXAMPLE 9 

Examples of applications of the PCA strategy to Gene Therapy

Another important embodiment of the invention is to provide 
a means and method for gene therapy of mammalian disease. Of particular
interest is the use of a PCA therapeutic for treatment of cancer. In one
embodiment of the PCA gene therapy, a PCA is developed employing fragments
(modular protein units) derived from a protein toxin, for example Pseudomonas
exotoxin, diphtheria toxin and the plant toxin gelonin, or other like molecules. For
therapy of breast cancer for example, a mammalian, retroviral, adenoviral, or
eukaryotic artificial chromosomal (EAC's) genetic construct is first prepared. The
construct introduces one fragment of the selected toxin under the control of the
promoter for expression of the erbB2 oncogene. It is well known that the erbB2
oncogene is overexpressed in breast cancer and adenocarcinoma cells (Slamon
et al., Science, 1989, 244:707). The HER2/neu (c-erbB-2) proto-oncogene
encodes a sub-class I 185-kDa transmembrane protein tyrosine kinase growth
factor receptor, p185 HER2. Also, the human erbB2 oncogene is located on
chromosome 17, region q21 and comprises 4,480 base pairs and p185 HER2
serves as a receptor for a 30-kDa glycoprotein growth factor secreted by human
breast cancer cell lines (Lupu et al., Science, 1990, 249:1152).

The transgene is introduced in vivo or ex vivo into target cells
employing methods known by those skilled in the art (e.g. homologous 
recombination to insert transgene into the locus of interest via retroviral, 
adenoviral or EAC's). A second genetic construct comprises a fusion gene 
containing a target DNA that encodes an interacting protein that interacts with 
the erbB2 oncogene (discovered by the PCA process described in this invention) 
and the "second" fragment of the toxin molecule. This construct is delivered to 
the patient by methods known in the art. For example, the construct can be 
delivered as shown in U.S. Patent Nos. 5,399,346 and 5,585,237 whose entire
contents are incorporated by reference herein. Transgene expression of the
erbB2 oncogene-toxin fragment described will now be under the control of the
constitutive oncogene promoter. Proliferating tumor cells will thus produce one
piece of the toxin attached as a fusion to the erbB2 oncogene. In the presence
of the second genetic construct expressing the PCA-discovered erbB2
oncogene "interacting protein - toxin fragment" construct, erbB2
oncogene-toxin fragment A : interacting protein-toxin fragment B complexes will be created
and induce death of target tumor cells through the creation of an active toxin
through Protein Fragment Complementation and thus provide an efficacious and
efficient therapy of the disease.

This can be extended to other diseases and other toxins 
employing techniques described and embodied in this invention. 

EXAMPLE 10 

Examples of applications of the PCA strategy to detect molecular
interactions in vitro

Any of the PCA strategies described above could be adapted
to in vitro detection. Unlike the in vivo PCAs however, detection would be
performed with purified PCA fragment-fusion proteins. Such uses of PCA have
the potential for use in diagnostic kits. For example the test DHFR assay
described above where the interacting domains are FKBP12 and TOR could be
used as a diagnostic test for rapamycin concentrations for use in monitoring
dosage in patients treated with this drug.

EXAMPLE 11

Signaling by the Erythropoietin Receptor Mediated by a Ligand-induced
Conformation Change in Constitutive Receptor Dimers 

The instant Example illustrates a fluorescent assay based on 
a dimerization-induced complementation of designed fragments of the enzyme 
murine dihydrofolate reductase (DHFR). The basis for the assay is that 
complementary fragments of DHFR when expressed and reassembled in cells, 
will bind to the high affinity (Kd = 100 pM) fluorescein-conjugated inhibitor
methotrexate (fMTX) in a 1:1 complex. fMTX is retained in cells by this complex,
while unbound fMTX is actively and rapidly transported out of the cells. In addition,
binding of fMTX to DHFR results in a 4.5-fold increase in quantum yield. Bound
fMTX, and by inference reconstituted DHFR, can then be monitored by
fluorescence microscopy, FACS or spectroscopy. Since the complex of fMTX
with DHFR is 1:1, measured fluorescence can be calibrated to determine
average numbers of complexes in individual cells or averages in a population of
cells. To test the allosteric model of receptor activation it was reasoned as follows:
if the receptor transmembrane domain is separated by the distance observed in 
the crystal structure of unligated EpoR, then DHFR fragments fused to the C- 
terminal of the transmembrane domains will complement only if ligand induces 
the necessary conformation change that allows the fragments to come into 
contact. Furthermore, the absolute regio- and stereospecific requirement that
fragments be sufficiently close to fold-reassemble into the enzyme
three-dimensional structure means that a false response, arising merely from
proximity of interacting proteins, is unlikely. In addition, insertion of flexible linker peptides of
a critical length between the transmembrane domain and the fragments should
result in constitutive complementation, insensitive to ligand. Based on the EpoR
crystal structure, the minimum length of linker necessary for a constitutive
response would be 10 amino acids, assuming that the length of an average
peptide bond is ~4 Å and the distance separating the fragments is 82 Å. Longer
linkers should result in complementation, independent of ligand. Linkers of 5, 10
and 30 amino acids, corresponding to extended lengths of 20, 40, and 120 Å,
respectively, were thus used. 
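
The linker arithmetic above can be checked with a short sketch. The 4 Å per residue and the 82 Å separation are the figures given in the text. The further assumption that each of the two receptor chains carries one fully extended linker, so that the two lengths add, is an interpretation made here for illustration and is not stated in the text.

```python
RESIDUE_LENGTH_A = 4.0  # approximate length per residue, from the text
SEPARATION_A = 82.0     # fragment separation, from the text

def extended_length(n_residues, per_residue=RESIDUE_LENGTH_A):
    """Length in angstroms of one fully extended linker."""
    return n_residues * per_residue

def combined_span(n_residues):
    """Span of two linkers, one per receptor chain (assumption)."""
    return 2 * extended_length(n_residues)

for n in (5, 10, 30):
    print(n, extended_length(n), combined_span(n),
          combined_span(n) >= SEPARATION_A)
```

Under these assumptions two 5-residue linkers span 40 Å, two 10-residue linkers span 80 Å, which is close to but just short of the 82 Å separation, and two 30-residue linkers span 240 Å.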

CHO DUKX-B11 (DHFR-) cells were co-transfected with EpoR
extracellular and transmembrane domains fused to the variable linkers and one
of the two DHFR complementary fragments F[1,2] or F[3]. Co-transfectants were
selected in nucleotide-free medium (selection for DHFR activity) and in the
presence of Epo (2 nM) to assure that activated receptor, and therefore
complementation and reconstitution of DHFR activity, was present. Fluorescence
microscopy (Fig. 9) of unfixed, co-transfected cells that had been incubated with
fMTX showed high levels of fluorescence (no nuclear fluorescence was
observed) when cells were pretreated with Epo or with the EpoR agonist peptide
EMP1 at saturating concentrations. In the absence of ligands, cells transfected
with EpoR-DHFR fragment fusions connected by 5 amino acid linkers showed
no fluorescence compared to non-transfected cells; those with 10 amino acid
linkers showed a small background of constitutive fluorescence; but those cells
expressing fusions with a 30 amino acid linker showed the same level of
fluorescence in the presence or absence of ligands. These results were
confirmed by FACS analysis (Fig 10A). Again, Epo-induced fluorescence was 
only seen for the 5 and 10 amino acid linked receptor-DHFR fragment fusions, 
but not for the 30 amino acid linker where the level of fluorescence was 
independent of ligand. It has been previously demonstrated that the fMTX
concentration in cells directly correlates with the number and activity of DHFR
molecules. Because of this, it is possible to calculate the average number of
receptors in the cell population, based on the FACS response. Assuming that
one EpoR dimer equals one reconstituted DHFR molecule in a 1:1 complex with
fMTX (22), an average number of receptors in Epo-activated cells of
approximately 11,000 receptors per cell for the 5, 10, and 30 amino acid linker
cases was calculated. The fact that the numbers of activated dimers are
approximately equal for each construct precludes one obvious problem with this 
strategy. It might be argued that steric hindrance by other proteins at the 
membrane intracellular surface might prevent complementation of the DHFR 
fragments by interfering with simple receptor dimerization in the case of 
receptors fused to the fragments through short linkers. However, if this were the 
case, the constitutive signal seen with the 30 and to some extent with the 10 
amino acid linker would be higher than for the activated 5 amino acid linker. In 
this case the receptors are, at a minimum, within the range of 80 Å from each
other. In addition, the fact that the number of activated receptors is the same in 
all cases suggests that no additional, or the same factors, determine the 
oligomerization state of the receptor for all three cases. 

To test whether the ligand-induced response corresponds to
the known pharmacological response of EpoR, quantitative FACS analysis of cell
fluorescence versus ligand concentration (Fig. 10B, C) was performed. Both Epo
and EMP1 showed saturable binding isotherms with Kd values of 164 pM and 168 nM,
respectively. These values are consistent with previous studies of cellular binding 
constants and demonstrate that the ligand induced response is consistent with 
the proposed model. Further, the results are consistent with a single binding 
constant, typically observed for both Epo and EMP1 binding to receptors 
expressed on a variety of cell types. 
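
A saturable isotherm with a single binding constant can be written as a simple hyperbolic function of ligand concentration. The sketch below assumes this single-site form for illustration; it is not the fitting procedure used to obtain the constants above. The Kd values and the 2 nM Epo concentration are those given in the text, and the function name is hypothetical.

```python
def fractional_response(ligand_molar, kd_molar):
    """Single-site saturable isotherm: fraction of maximal response."""
    return ligand_molar / (ligand_molar + kd_molar)

KD_EPO = 164e-12   # 164 pM, from the text
KD_EMP1 = 168e-9   # 168 nM, from the text

# Response at the 2 nM Epo concentration used during selection.
epo_at_2nM = fractional_response(2e-9, KD_EPO)

# By construction, the response is half-maximal when ligand equals Kd.
half_max_emp1 = fractional_response(KD_EMP1, KD_EMP1)
```

Under this single-site assumption, 2 nM Epo corresponds to roughly 92 % of the maximal response.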

Applicants have shown results consistent with an allosteric 
mechanism of EpoR activation in the case of the extracellular plus 
transmembrane domains alone. To demonstrate that this model applies to the 
complete receptor complex, full length EpoR and JAK2 were coexpressed in
COS-7 cells fused to the variable linkers and complementary F[1,2] and F[3] 
fragments. Results were identical to those observed in CHO cells for the 
extracellular EpoR domain alone. JAK2 fused to the 5 or 10 linker and F[1,2] or 
F[3] co-expressed with full length EpoR alone gave an induced response to Epo 
or EMP1 (Fig 11A). Co-expressed alone, JAK2-5,10-F[1,2] and JAK2-5,10-F[3] 
showed no constitutive fluorescence, suggesting that they do not interact 
detectably even when transiently overexpressed. Constitutive fluorescence was
seen when JAK2-5,10-F[1,2] was coexpressed with EpoR-5,10-F[3], consistent
with previous studies indicating that this interaction is constitutive. However, Epo
and EMP1 did induce an augmentation of fluorescence in this case, suggesting
that in the activated state, the complex of EpoR with JAK2 is more stable.
Coexpression of EpoR-5aa or 10aa-F[1,2] with EpoR-5aa or 10aa-F[3] also gave a
constitutive response. These results would appear contradictory to the model,
but they are not. It was observed, in circular dichroism and NMR studies, that the
236 amino acid intracellular domain of EpoR is not folded. Taken together with
the results presented herein, this suggests that, in the fluorescent assay, the intracellular domain
acts as a very long linker, resulting in constitutive reconstitution of DHFR.
Cotransfected EpoR and JAK2 were also shown to function normally with the
attached F[1,2] and F[3] fragments. Western blots with anti-EpoR, JAK2 and pTyr
show that both proteins are expressed in the cells and that both undergo Epo-
or EMP1-induced phosphorylation (Fig. 11).

Based on the results presented here and the structural studies 
of Wilson, et al, an allosteric model of EpoR activation is proposed. Constitutive 
dimers of EpoR bring the JAK2 kinases associated with each monomer 
intracellular domain into contact and allow autophosphorylation and activation of 
the kinases. This model is not in any way inconsistent with dimerization models; 
certainly dimerization is required, but it is not necessarily a sufficient condition for 
receptor activation. However, given the high sequence and structural homology 
among the cytokine growth factors it is possible that this model could be 
generalized to this class of receptors. Furthermore, constitutive dimerization of 
the cytokine IL-2 receptor, IL-1 receptor and of epidermal growth factor receptors 
(EGFR) has been detected by quantitative FRET microscopy, suggesting that an 
allosteric mechanism of activation may apply to cytokine and other receptor 
classes. Simple dimerization, dimerization-allostery or different types of 
conformation change in constitutive dimers are also possible models for receptor 
activation. For the structurally and functionally well understood bacterial 
chemotactic Tar receptors, the mechanism of activation also results from a 
conformation change in constitutive dimers or tetramers induced by ligand, but 
the changes are more subtle than those suggested by these results, involving 
possibly, small piston-, or scissor-like motions or helix supercoiling. The insulin 
receptor is also known to be a constitutive disulfide cross-linked dimer and might, 
as a result, not be capable of the large conformation changes suggested here 
for EpoR. The DHFR PCA strategy presented here would be applicable to testing 
this model in other dimeric or multimeric cytokine receptors or to studying the 
interactions of other membrane or soluble proteins with activated receptor 
complexes. The DHFR PCA could also be used in a FRET strategy with proteins 
fused to fluorescent proteins with complementary absorption-emission spectra, 
such as mutants of the green fluorescent protein. An important advantage of the 
DHFR PCA is the absolute requirement that fragments be sufficiently close to 
fold-reassemble into the enzyme three dimensional structure. This absolute 
regio- and stereospecific requirement means that a spurious response that might 
occur between proteins that are merely proximal to each other but not forming 
a complex is unlikely. There are two other advantages. First, DHFR is a 
small, monomeric enzyme; thus, an observed signal is assured to be due to a 
dimeric interaction. Second, the stringency of reassembly can be controlled 
directly by the introduction of fragment interface mutations that will prevent 
background reassembly of fragments. However, with sufficient controls, a 
combination of DHFR and other PCAs used in a FRET strategy would provide 
a powerful approach to studies of protein association dynamics throughout the 
cell in localized regions and compartments. 

Of course, numerous membrane receptors can be used when 
screening for agonist and antagonist. The receptors of interest include the 
Erythropoietin receptor as well as the following additional cellular receptors: 
receptor from a member of a protein family selected from the group consisting 
of the TGF-beta, NGF, FGF/HBGF, chemokine, IL-6, LIF/OSM, TNF, MDK/PTN 
families, Mullerian inhibitory substance (MIS), the inhibins (INHA and INHB), the 
bone morphogenic proteins (BMP), the growth development factors (GDF-1, 
GDF-3, GDF-5, GDF-6, GDF-7 and GDF-8), endometrial bleeding associated 
factor (EBAF/Lefty), glial cell line-derived neurotrophic factor (GDNF), nerve 
growth factor (NGF), brain-derived neurotrophic factor (BDNF), neurotrophin-3 
(NT-3), NT-4 and NT-5, fibroblast growth factor-3 (FGF-3), FGF-4 (int-2), FGF-5, 
FGF-6 (hst-2), keratinocyte growth factor (KGF/FGF-7), androgen-induced 
growth factor (AIGF/FGF-8), glia-activating factor (GAF/FGF-9), FGF-11, FGF- 
12, FGF-13, and FGF-14, platelet factor 4 (PF4), platelet basic protein (PBP), 
monocyte-derived neutrophil chemotactic factor (MDNCF/IL-8), melanoma 
growth stimulatory activity protein (MGSA), macrophage inflammatory protein 2 
(MIP-2), Mig, chicken 9E3, pig alveolar macrophage chemotactic factor, pre-B 
cell growth stimulatory factor (PBSF), cytokine-induced neutrophil 
chemoattractant-2, IP10, monocyte chemotactic protein 1 (MCP-1), MCP-2, 
MCP-3, MCP-4, MCP-5, MIP-1-alpha, MIP-1-beta, MIP-1-gamma, MIP-3-alpha, 
MIP-3-beta, MIP-4, MIP-5, RANTES, SIS-epsilon, thymus and activation- 
regulated chemokine (TARC), eotaxin, I-309, HCC-1/NCC-2, HCC-3, 
6Ckine/Exodus-2/SLC, thymus-expressed chemokine (TECK), mouse protein 
C10, IL-6, granulocyte colony-stimulating factor (G-CSF), and myelomonocytic 
growth factor (MGF), leukemia inhibitory factor (LIF) and oncostatin (OSM), 
tumor necrosis factor alpha (TNF-a), tumor necrosis factor beta (TNF-b/LT-a), 
CD40L, CD137L/4-1BBL, CD134L/OX40L, CD27L/CD70, FasL, CD30L, LT-b, 
TNF-related apoptosis-inducing ligand (TRAIL), macrophage stimulating protein, 
hepatocyte growth factor, platelet-derived growth factor, insulin-like growth 
factor, platelet-derived endothelial cell growth factor, IL-1a, IL-1b, IL-2, IL-3, IL-4, 
IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, IL-17 and 
IL-18. Other receptors include nuclear receptors or coactivators such as the 
vitamin D receptors, retinoid receptors, steroid receptors and gamma PPAR 
receptors. 

As shown above, the instant invention allows: 

1) the detection of protein-protein interactions in vivo or in vitro. 

2) the detection of protein-protein interactions in appropriate contexts, such as 
within a specific organism, cell type, cellular compartment, or organelle. 

3) the detection of induced versus constitutive protein-protein interactions (such 
as by a cell growth or inhibitory factor). 

4) the distinction of specific versus non-specific protein-protein interactions by 
controlling the sensitivity of the assay. 

5) the detection of the kinetics of protein assembly in cells. 

6) screening of cDNA libraries for protein-protein interactions. 

Further aspects of the invention can be demonstrated by 
identifying novel interactions with the enzyme p70S6k, to determine its 
regulation and how separate signaling cascades converge on this enzyme. 

The PCA method is particularly useful for detection of the 
kinetics of protein assembly in cells. The kinetics of protein assembly can be 
determined using fluorescent protein systems. 

In a further embodiment of the invention, PCA can be used for 
drug screening. The techniques of PCA are used to screen for drugs that block 
specific biochemical pathways in cells allowing for a carefully targeted and 
controlled method for identifying products that have useful pharmacological 
properties. 
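
By way of a non-limiting illustration, the comparison of reporter activity at several concentrations of a test compound can be sketched in code. The sketch below is an assumption-laden example, not the applicants' procedure: it assumes a single-site binding model (in keeping with the single binding constant noted above), uses synthetic noise-free data in arbitrary fluorescence units, and the names `isotherm` and `fit_single_site` are hypothetical.

```python
def isotherm(conc, kd, s_max):
    """Single-site binding model: signal as a function of ligand concentration."""
    return s_max * conc / (kd + conc)


def fit_single_site(concs, signals, kd_grid):
    """Estimate (kd, s_max) by grid search over kd.

    For each candidate kd, the best s_max has a closed-form
    least-squares solution, so only kd needs to be scanned.
    """
    best = None
    for kd in kd_grid:
        occupancy = [c / (kd + c) for c in concs]
        denom = sum(o * o for o in occupancy)
        if denom == 0.0:
            continue
        s_max = sum(o * s for o, s in zip(occupancy, signals)) / denom
        sse = sum((s - s_max * o) ** 2 for o, s in zip(occupancy, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, s_max)
    if best is None:
        raise ValueError("no usable kd candidate")
    return best[1], best[2]


# Synthetic example (assumed values): true kd = 2.0, true s_max = 100.0
concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
signals = [isotherm(c, 2.0, 100.0) for c in concs]
kd_grid = [10 ** (e / 50) for e in range(-100, 151)]  # 0.01 to 1000, log-spaced
kd_est, s_max_est = fit_single_site(concs, signals, kd_grid)
```

With real measurements, the background signal of untreated cells would be subtracted first, and a poor fit to this model would suggest that more than one binding constant is involved.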
Although the present invention has been described herein 
above by way of preferred embodiments thereof, it can be modified, without 
departing from the spirit and nature of the subject invention as defined in the 
appended claims. 
WHAT IS CLAIMED IS: 

1. A method employing a Protein Complementation 
Assay/Universal Reporter System (PCA/URS) for detecting and screening for 
agonists and antagonists of a cellular receptor, which method comprises: 

a) generating a first nucleic acid vector encoding a first fusion 
product comprising: 

i) a first fragment of a first PCA/URS reporter molecule, 

and 

ii) a second molecule, fused to said first fragment, which 
comprises a first subdomain of a cellular receptor molecule of interest; 

b) generating a second nucleic acid vector encoding a second 
fusion product comprising: 

i) a second fragment of said first PCA/URS reporter molecule, and 

ii) a third molecule, fused to said second fragment, which 
comprises a second subdomain of said cellular receptor, and where said second 
subdomain may be the same as said first subdomain in the case of a 
homodimeric cellular receptor, or different from said first subdomain in the case 
of a heterodimeric cellular receptor; or a receptor coactivator or a protein; 

c) transfecting prokaryotic or eukaryotic cells with said first 
and second nucleic acid vectors; 

d) testing said transfected cells for the PCA/URS reporter 
activity, said activity indicating reassociation of the first and second fragments of 
the PCA/URS molecule mediated by the interaction of said first and second 
subdomains of the cellular receptor molecule; said association being induced by 
binding said receptor to cognate ligand. 

2. A method employing a Protein Complementation 
Assay/Universal Reporter System (PCA/URS) for detecting and screening for 
agonists and antagonists of a cellular receptor, which method comprises: 
a) generating a first nucleic acid vector encoding a first fusion 
product comprising: 

i) a first fragment of a first PCA/URS reporter molecule, 

and 

ii) a second molecule, fused to said first fragment, which 
comprises a first subdomain of a cellular receptor molecule of interest; 

b) generating a second nucleic acid vector encoding a second 
fusion product comprising: 

i) a second fragment of said first PCA/URS reporter molecule, and 

ii) a third molecule, fused to said second fragment, which 
comprises a second subdomain of said cellular receptor, and where said second 
subdomain may be the same as said first subdomain in the case of a 
homodimeric cellular receptor, or different from said first subdomain in the case 
of a heterodimeric cellular receptor, 

c) transfecting prokaryotic or eukaryotic cells with said first 
and second nucleic acid vectors; 

d) obtaining a clonal population of cells that express said first 
and second fusion products; and 

e) testing said transfected cells for the PCA/URS reporter 
activity, said activity indicating reassociation of the first and second fragments of 
the PCA/URS molecule mediated by the interaction of said first and second 
subdomains of the cellular receptor molecule; said association being induced by 
binding said receptor to cognate ligand. 

3. The method of claim 2, further comprising the step of 
treating said clonal population of cells with a chemical composition prior to said 
testing of the cells for PCA/URS activity, thus measuring the ability of the 
chemical composition to induce or inhibit the activity. 
4. The method of claim 3, wherein said chemical composition 
is an individual compound or a mixture of compounds obtained from a chemical 
compound library or combinatorial chemical synthesis. 

5. The method of claim 2, wherein said reporter molecule is 
a multimeric protein. 

6. The method of claim 2, wherein said reporter molecule is 
a multimeric receptor. 

7. The method of claim 2, wherein said reporter molecule is 
a multimeric binding protein. 

8. The method of claim 2, wherein said reporter molecule is 
a catalytic molecule. 

9. The method of claim 2, wherein said reporter molecule is 
an energy transfer molecule. 

10. The method of claim 3, wherein said reporter molecule is 
a multimeric protein. 

11. The method of claim 2, wherein said reporter molecule is 
a fluorescent, luminescent or phosphorescent protein. 

12. The method of claim 2, wherein said reporter molecule is 
an electron transfer molecule. 

13. The method of claim 2, wherein said reporter molecule is 
a chemiluminescent molecule. 
14. The method of claim 3, wherein said chemical composition 
is a ligand agonist or antagonist. 

15. The method of claim 3, wherein said chemical composition 
is a nucleic acid. 

16. The method of claim 3, wherein said chemical composition 
is a peptide. 

17. The method of claim 3, wherein said chemical composition 
is a carbohydrate. 

18. The method of claim 3, wherein said chemical composition 
is a natural product or extract. 

19. The method of claim 4, wherein said library of compounds 
is a combinatorial nucleic acid library. 

20. The method of claim 4, wherein said library of compounds 
is a combinatorial carbohydrate library. 

21. The method of claim 4, wherein said library of compounds 
is a combinatorial peptide or protein library. 

22. The method of claim 3, wherein in the treatment step the 
cells are treated with the chemical composition at different concentrations in the 
medium, and the PCA/URS activity is compared at the different concentrations. 

23. The method of claim 22, wherein the values of PCA/URS 
activity versus concentration of treatment composition are used to estimate the 
binding isotherm of the composition to the cellular receptor. 
24. The method of claim 2, wherein the PCA/URS activity is 
detected using a fluorescent assay, and the activity is monitored by fluorescence 
microscopy, fluorescent cell sorting (FACS) or by spectroscopy of aliquots of the 
cells. 

25. The method of claim 22, wherein said reporter molecule 
is dihydrofolate reductase and said detection method comprises treatment of the 
cells with fluorescein-conjugated methotrexate before monitoring the cellular 
fluorescence. 

26. The method of claim 2, wherein said cellular receptor is 
the Erythropoietin receptor. 

27. The method of claim 2, wherein said cellular receptor is 
a naturally occurring protein which upon binding a ligand induces a cellular 
response. 

28. The method of claim 2, wherein said cellular receptor is 
an enzyme which is activated by binding a ligand. 

29. The method of claim 2, wherein said cellular receptor is 
a natural or synthetic protein which undergoes conformational change or 
oligomerizes upon binding a ligand. 

30. The method of claim 2, wherein said cellular receptor is 
a member of the cytokine receptor superfamily. 

31. The method of claim 2, wherein said cellular receptor is 
the receptor for an interleukin or cytokine. 
32. The method of claim 2, wherein said cellular receptor is 
a hormone receptor. 

33. The method of claim 2, wherein said cellular receptor is 
a receptor for a member of a protein family selected from the group consisting 
of the TGF-beta, NGF, FGF/HBGF, chemokine, IL-6, LIF/OSM, TNF, and 
MDK/PTN families. 

34. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the tumor growth factor beta family. 

35. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of the forms of TGF- 
beta, Mullerian inhibitory substance (MIS), the inhibins (INHA and INHB), the 
bone morphogenic proteins (BMP), the growth development factors (GDF-1, 
GDF-3, GDF-5, GDF-6, GDF-7 and GDF-8), endometrial bleeding associated 
factor (EBAF/Lefty), and glial cell line-derived neurotrophic factor (GDNF). 

36. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the nerve growth factor family. 

37. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of nerve growth 
factor (NGF), brain-derived neurotrophic factor (BDNF), neurotrophin-3 (NT-3), 
NT-4 and NT-5. 

38. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the fibroblast growth factor and heparin-binding 
growth factor family. 
39. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of fibroblast growth 
factor-3 (FGF-3), FGF-4 (int-2), FGF-5, FGF-6 (hst-2), keratinocyte growth factor 
(KGF/FGF-7), androgen-induced growth factor (AIGF/FGF-8), glia-activating 
factor (GAF/FGF-9), FGF-11, FGF-12, FGF-13, and FGF-14. 

40. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the chemokine family. 

41. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of platelet factor 4 
(PF4), platelet basic protein (PBP), monocyte-derived neutrophil chemotactic 
factor (MDNCF/IL-8), melanoma growth stimulatory activity protein (MGSA), 
macrophage inflammatory protein 2 (MIP-2), Mig, chicken 9E3, pig alveolar 
macrophage chemotactic factor, pre-B cell growth stimulatory factor (PBSF), 
cytokine-induced neutrophil chemoattractant-2, and IP10. 

42. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of monocyte 
chemotactic protein 1 (MCP-1), MCP-2, MCP-3, MCP-4, MCP-5, MIP-1-alpha, 
MIP-1-beta, MIP-1-gamma, MIP-3-alpha, MIP-3-beta, MIP-4, MIP-5, RANTES, 
SIS-epsilon, thymus and activation-regulated chemokine (TARC), eotaxin, I-309, 
HCC-1/NCC-2, HCC-3, 6Ckine/Exodus-2/SLC, thymus-expressed chemokine 
(TECK) and mouse protein C10. 

43. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of fractalkine and 
GCP-2/LIX. 
44. The method of claim 2, wherein said cellular receptor is 
a member of the group consisting of CXCR-1, CXCR-2, CXCR-3, CXCR-4, CCR- 
1, CCR-2, CCR-3, CCR-4, CCR-5, CCR-6, CCR-7, CCR-8, and CX3CR. 

45. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the interleukin-6 (IL-6) family. 

46. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of IL-6, granulocyte 
colony-stimulating factor (G-CSF), and myelomonocytic growth factor (MGF). 

47. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the leukemia inhibitory factor and oncostatin family. 

48. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the group selected from leukemia inhibitory factor 
(LIF) and oncostatin (OSM). 

49. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the tumor necrosis factor family. 

50. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of tumor necrosis 
factor alpha (TNF-a), tumor necrosis factor beta (TNF-b/LT-a), CD40L, 
CD137L/4-1BBL, CD134L/OX40L, CD27L/CD70, FasL, CD30L, LT-b and TNF- 
related apoptosis-inducing ligand (TRAIL). 

51. The method of claim 2, wherein said cellular receptor is 
a receptor selected from the group consisting of LNGFR/p75, CD40, CD137/4- 
1BB/ILA, TNFRI/p55/CD120a, TNFRII/p75/CD120b, CD134/OX40/ACT35, 
CD27, Fas/CD95/APO-1, CD30/Ki-1, LT-betaR, DR3/WSL-1/TRAMP/APO- 
3/LARD, DR4, DR5, DcR1/TRID, TR2, GITR and osteoprotegerin (OPG). 

52. The method of claim 2, wherein said cellular receptor is 
the receptor for a member of the midkine and pleiotrophin family. 

53. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of midkine (MK), 
pleiotrophin (PTN), chicken retinoic acid-induced heparin-binding protein (RI- 
HB), and Xenopus pleiotrophic factors alpha-1, alpha-2, beta-1 and beta-2. 

54. The method of claim 2, wherein said cellular receptor is 
a member of the family of G-protein-coupled receptors. 

55. The method of claim 2, wherein said cellular receptor is 
a receptor for transferrin. 

56. The method of claim 2, wherein said cellular receptor is 
a receptor for a member of the group consisting of macrophage stimulating 
protein, hepatocyte growth factor, platelet-derived growth factor, insulin-like 
growth factor and platelet-derived endothelial cell growth factor. 

57. The method of claim 2, wherein said cellular receptor is 
the receptor for a steroid hormone. 

58. The method of claim 2, wherein said cellular receptor is 
the receptor for an eicosanoid hormone. 

59. The method of claim 2, wherein said cellular receptor has 
been identified from an expressed sequence tag (EST) nucleic acid sequence. 
60. The method of claim 2, wherein said cellular receptor is 
the receptor for a protein selected from the group consisting of IL-1a, IL-1b, IL-2, 
IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, 
IL-17 and IL-18. 

61. The method of claim 2 wherein said receptor is a nuclear 
receptor or coactivator of said nuclear receptor. 

62. The method of claim 61 wherein said receptor is the 
Vitamin D receptor. 

63. The method of claim 61 wherein said receptor is a Vitamin 
A or a retinoid associated receptor. 

64. The method of claim 61 wherein said receptor is a 
Gamma PPAR. 

65. The method of claim 61 wherein said receptor is a steroid 
receptor. 

66. A method employing a Protein Complementation 
Assay/Universal Reporter System (PCA/URS) for detecting and screening for 
agonists and antagonists of a membrane receptor, which method comprises: 

a) generating a first nucleic acid vector encoding a first fusion 
product comprising: 

i) a first fragment of a first PCA/URS reporter molecule, 

ii) a first linker fused at one end to said first fragment, 
said linker region comprising between 1 and 30 amino acid residues; and 

iii) a second molecule, fused to the other end of said first 
linker, which comprises a first subdomain of a cellular receptor molecule of 
interest; 
b) generating a second nucleic acid vector encoding a second 
fusion product comprising: 

i) a second fragment of said first PCA/URS reporter molecule, 

ii) a second linker, fused at one end to said second 
fragment, said linker comprising between 1 and 30 amino acid residues; and 

iii) a third molecule, fused to the other end of said second linker, 
which comprises a second subdomain of said cellular receptor, and where said 
second subdomain may be the same as said first subdomain in the case of a 
homodimeric cellular receptor, or different from said first subdomain in the case 
of a heterodimeric cellular receptor; 

c) transfecting prokaryotic or eukaryotic cells with said first 
and second nucleic acid vectors; 

d) testing said transfected cells for the PCA/URS reporter 
activity, said activity indicating reassociation of the first and second fragments of 
the PCA/URS molecule mediated by the interaction of said first and second 
subdomains of the cellular receptor molecule. 
Insertion of fragments into pQE-32 for bacterial screening, 
or pMT3 or Zap Express for eukaryotic screening: 
Note: for eukaryotic expression, 
Identification of clones (for library screening only). 

-Propagation of cells (bacterial or eukaryotic) 
-Isolation of plasmid DNA and insert sequencing 
-For bacterial screening only: overexpression and one-step 
purification of fusion products by the hexahistidine tag 